Method for producing a Gag-Env fusion protein

ABSTRACT

Disclosed is a substantially pure HIV antigen comprising a Gag-Env fusion protein consisting of a Gag peptide fused at its C-terminus to an Env peptide, wherein the Gag peptide comprises a contiguous sequence of at least ten amino acids of the amino acid sequence represented by Gag (308-437) and the Env peptide comprises a contiguous sequence of at least a part of the amino acid sequence represented by Env (512-699), the part containing at least one epitope which is reactive to an HIV antibody. The gag-env fusion DNA corresponding to the HIV antigen of the present invention allows the production of the desired high antigenicity HIV antigen in high yield. Therefore, the HIV antigen of the present invention can be advantageously used as an active component for a diagnostic reagent, a vaccine, an antibody preparation and a therapeutic reagent for AIDS. Also disclosed is a substantially pure HIV antigen comprising a Gag protein SEQ ID No.:1 coded for by the entire gag gene.

This application is a divisional of application Ser. No. 08/375,510, now U.S. Pat. No. 5,576,421, filed on Jan. 18, 1995, which is a Rule 62 continuation of abandoned application 07/985,949, filed on Dec. 4, 1992, the entire contents of which are hereby incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a human immunodeficiency virus (HIV) antigen. More particularly, the present invention is concerned with a substantially pure HIV antigen comprising a Gag-Env fusion protein consisting of a specific Gag peptide fused at its C-terminus to a specific Env peptide, which antigen not only exhibits excellent HIV antigenicity, but which can also be obtained at a level that has never been attained to date, and is also concerned with a method for producing the same. The HIV antigen of the present invention is useful as an active component for a diagnostic reagent, a vaccine, an antibody preparation and a therapeutic reagent for AIDS (acquired immune deficiency syndrome).

2. Discussion of Related Art

As is well known in the art, since the first AIDS patient was reported in 1981, the number of AIDS patients has been increasing in geometric progression. As of Apr. 1992, the total number of AIDS patients is as large as about 500,000. Although research on the prevention and medical treatment of the disease has been extensively and intensively conducted throughout the world, no infallible preventive and therapeutic methods are in practical use. The global spread of AIDS without any infallible preventive and therapeutic methods is now a world-shaking problem. On the other hand, the AIDS virus was first isolated and identified in 1983, and since then, research on AIDS in both the basic and clinical aspects has become active in the field of virology (see Nature, 326, 435-436, 1987). As a result, remarkable progress has been made in the diagnosis of AIDS, and immunodiagnostic reagents for use in the diagnosis and methods for producing the same are rapidly being improved.

AIDS viruses have been isolated from humans, monkeys and cats. Of them, the virus isolated from humans is designated "human immunodeficiency virus (HIV)". HIV is broadly classified into HIV-1 and HIV-2. HIV-1 is spreading worldwide, i.e., in the U.S.A., Europe, Central Africa and other numerous countries of the world, while HIV-2 is mainly spreading only in West Africa. HIV is a spherical virus of from 100 to 140 nm in diameter which has an envelope (Env). The Env is comprised of transmembrane protein (gp41) and 70 to 80 peplomers (gp120) which form rod-shaped protrusions, each having a diameter of 15 nm and a height of about 9 nm, and which are present in the surface of the viral particle. In the core of the viral particle, two single strand RNA molecules of the viral genome form a complex with reverse transcriptase and structural proteins as the viral core, in which primer tRNA is present. The viral genome has a length of more than 9 kb and is comprised of about 10 different genes. Essentially, the viral genome is comprised of the following three major genes coding for the viral components essential for multiplication of the virus:

(1) gag (group-specific antigen) gene coding for p55 which is a precursor protein of three types of structural proteins p17, p24, and p15 of the viral core;

(2) pol (polymerase) gene coding for a precursor of three different enzymes, i.e., protease, reverse transcriptase, and integrase; and

(3) env (envelope) gene coding for gp160, which is a precursor of two types of glycoproteins, gp120 and gp41 forming the viral envelope.

These genes are arranged in the sequence of gag, pol, env in the direction from the 5'-end toward the 3'-end of the viral genome.

The remaining approximately seven other genes, so-called accessory genes, are believed to take part in the control of infection, multiplication, maturation of HIV, and the development of illness.

Various HIV antigens and enzymes essential for the basic studies of AIDS and for the development and production of therapeutic reagents, diagnostics and vaccines therefor can be produced by culturing HIV. However, the culturing of HIV is accompanied by the danger of fatal biohazard. Therefore, various studies and attempts have been made to develop a technique for the production of such antigens and enzymes in large quantity without culturing HIV. For example, with respect to both the gag and pol genes, various HIV antigens and enzymes, such as Gag proteins p17, p24, and p15 (see Japanese Patent Application Laid-Open Specification No. 4-117289) and pol gene products, e.g., protease, reverse transcriptase, and integrase (see Japanese Patent Application Laid-Open Specification No. 2-265481), have been successfully produced in high yield by a technique capable of expressing the genes in E. coli and processing the produced protein, and some of the antigens and enzymes have been put to practical use.

On the other hand, with respect to the expression of the env gene of HIV, the highly efficient expression of the env gene alone is extremely difficult, as compared to that of the gag and pol genes, although the reason has not yet been elucidated. Therefore, in many cases, the env gene of HIV is expressed in a chimeric form with a foreign gene, thereby producing the Env peptide as a protein in which the Env peptide is fused to a foreign peptide. For example, it is known to express the env gene in a chimeric form with a poliovirus gene, to thereby produce a fusion protein in which the Env peptide is fused to a poliovirus antigen peptide (see Journal of Virology, 65, 2875-2883, 1991). It is also known to express the env gene in a chimeric form with a gag gene by means of E. coli expression plasmid pEV-vrf, to thereby produce a Gag-Env fusion protein in which the Env peptide is fused to the Gag peptide (Analytical Biochemistry, 161, 370-379, 1987). In these cases, a peptide coded for by a foreign gene or a structural gene, which is positioned downstream of a promoter in an expression plasmid, is fused at its C-terminus to the Env peptide. When the Env peptide is fused to a foreign peptide, the Env peptide is likely to exhibit non-specificity in a reaction with test serum. Therefore, the Env peptide which is fused to a foreign peptide is inferior to a pure HIV antigen in quality and reliability for use as an HIV antigen. Furthermore, the Env peptide which is fused to a foreign peptide is not good in terms of production yield.

It is conceivable to cleave a fusion protein at the site of the junction of the Env peptide and a foreign peptide, in order to remove the foreign peptide. However, by this method, it is not possible to obtain the desired Env peptide in a foreign peptide-free, pure form in high yield and at low cost.

On the other hand, it is known to express the Env peptide as a Gag-Env fusion protein consisting of a Gag peptide fused to the Env peptide (see Japanese Patent Application Laid-Open Specification No. 1-179687; Virology, 180, 811-813, 1991 and European Patent Application Publication No. 307149). However, the yield of the conventional Gag-Env fusion protein is likely to be poor, thereby causing the production cost to be high. Furthermore, such a Gag-Env fusion protein is likely to be poor in antigenicity, so that its reliability as an HIV antigen is low.

Thus, the conventional HIV antigens are disadvantageous in that they are poor in quality, reliability and productivity. Therefore, a novel HIV antigen which is free from such problems has been much desired from a practical and commercial viewpoint, and the development of such a novel HIV antigen has been a task of great urgency in the art.

As mentioned above, the Env peptide has conventionally been produced in relatively large quantity as a fusion protein in which the Env peptide is fused to a foreign peptide, but the conventionally obtained Env peptide as a fusion protein is likely to exhibit non-specificity in an antigen-antibody reaction, so that the fusion protein is unsatisfactory in quality and reliability for use as an HIV antigen. Such a fusion protein is unsuitable for the practical diagnosis of AIDS. It should further be noted that the present invention has been attained by overcoming the serious problem of the prior art that even when the gag gene and env gene are fused to each other and expressed by conventional genetic engineering techniques, a Gag-Env fusion protein which is highly reliable as an HIV antigen can never be produced in high yield.

SUMMARY OF THE INVENTION

The present inventors have made extensive and intensive studies with a view toward solving the above-mentioned problems by developing a novel HIV antigen. As a result, the present inventors have succeeded in developing a novel HIV antigen which is excellent in quality, reliability, and productivity, and hence is extremely advantageous from a practical and commercial viewpoint. Particularly, the present inventors have unexpectedly found that a specific, substantially pure HIV antigen comprising a Gag-Env fusion protein (wherein the Gag-Env fusion protein consists of a Gag peptide fused at its C-terminus to an Env peptide, and wherein the Gag peptide comprises a contiguous sequence of at least ten amino acids of the amino acid sequence represented by Gag (308-437) defined herein, and the Env peptide comprises at least a part of the amino acid sequence represented by Env (512-699) defined herein, the part containing at least one epitope which is reactive to an HIV antibody) not only exhibits excellent antigenicity, but can also be produced in a yield higher than has conventionally been attainable.

Furthermore, in the process of designing the above-mentioned HIV antigen of the present invention comprising a specific Gag-Env fusion protein, the present inventors have unexpectedly found that as an HIV antigen, the Gag protein obtained by expressing the gag gene coding for the entire amino acid sequence of the Gag protein as an immature protein is advantageous in that it not only exhibits a broad spectrum of reactivity with HIV antibodies, but also exhibits strong reactivity in antigen-antibody reactions, as compared to the Gag proteins, p17, p24, and p15, which are mature proteins and have conventionally been known to be useful as antigens for use in the diagnosis of AIDS.

Based on these novel findings, the present invention has been completed.

Therefore, it is an object of the present invention to provide a substantially pure HIV antigen which is of high quality and which does not exhibit immunologically non-specific reactivity, and hence can be advantageously used for producing a testing reagent for HIV, an HIV antibody, a reagent for the diagnosis of AIDS, a vaccine for AIDS, and the like.

It is another object of the present invention to provide a method for producing an HIV antigen in high yield and with high efficiency.

It is still another object of the present invention to provide a recombinant DNA molecule which is useful for production of an HIV antigen in high yield and with high efficiency.

It is a further object of the present invention to provide a reagent for the diagnosis of AIDS.

It is still a further object of the present invention to provide a vaccine for AIDS.

The foregoing and other objects, features and advantages of the present invention will be apparent from the following detailed description and appended claims taken in connection with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

In the accompanying drawings:

FIGS. 1A and 1B show the entire amino acid sequence of the Gag protein coded for by the entire region of the gag gene contained in plasmid pNL4-3 containing the entire HIV-1 genome; and

FIGS. 2A, 2B, and 2C show the entire amino acid sequence of the Env protein SEQ ID No.: 2 coded for by the entire region of the env gene contained in plasmid pNL4-3 containing the entire HIV-1 genome.

DETAILED DESCRIPTION OF THE INVENTION

In one aspect of the present invention, there is provided a substantially pure HIV antigen comprising a Gag-Env fusion protein, wherein the Gag-Env fusion protein consists of a Gag peptide fused at its C-terminus to an Env peptide.

In the HIV antigen of the present invention, the Gag peptide comprises a contiguous sequence of at least ten amino acids of the amino acid sequence represented by Gag (308-437), wherein each of the numbers indicated in the parentheses is the positional amino acid number in the entire amino acid sequence of the Gag protein shown in FIGS. 1A and 1B. The Env peptide comprises at least a part of the amino acid sequence represented by Env (512-699), wherein the part contains at least one epitope which is reactive to an HIV antibody, and wherein each of the numbers indicated in the parentheses is the positional amino acid number in the entire amino acid sequence of the Env protein of HIV shown in FIGS. 2A, 2B, and 2C.

In the present invention, the region of the amino acid sequence of each of the various Gag peptides and the region of the amino acid sequence of each of the various Env peptides are indicated in parentheses as indicated above. With respect to the Gag peptide, each of the numbers indicated in the parentheses is the positional amino acid number in the entire amino acid sequence of the Gag protein of HIV shown in FIGS. 1A and 1B and, with respect to the Env peptide, each of the numbers indicated in the parentheses is the positional amino acid number in the entire amino acid sequence of the Env protein of HIV shown in FIGS. 2A, 2B and 2C.

In the present invention, unless otherwise specified, the left end and right end of the amino acid sequence of the peptide or protein are the N-terminus and C-terminus, respectively. In the amino acid sequence, Asp represents an aspartic acid residue, Glu a glutamic acid residue, Lys a lysine residue, Arg an arginine residue, His a histidine residue, Asn an asparagine residue, Gln a glutamine residue, Ser a serine residue, Thr a threonine residue, Tyr a tyrosine residue, Cys a cysteine residue, Trp a tryptophan residue, Phe a phenylalanine residue, Gly a glycine residue, Ala an alanine residue, Val a valine residue, Leu a leucine residue, Ile an isoleucine residue, Pro a proline residue, and Met a methionine residue.
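For readers who handle these sequences in software, the three-letter abbreviations listed above correspond to the standard one-letter amino acid codes. The following is a minimal illustrative sketch only and forms no part of the disclosed method; the helper name `to_one_letter` is hypothetical.

```python
# Standard three-letter to one-letter codes for the 20 residues
# named in the paragraph above.
THREE_TO_ONE = {
    "Asp": "D", "Glu": "E", "Lys": "K", "Arg": "R", "His": "H",
    "Asn": "N", "Gln": "Q", "Ser": "S", "Thr": "T", "Tyr": "Y",
    "Cys": "C", "Trp": "W", "Phe": "F", "Gly": "G", "Ala": "A",
    "Val": "V", "Leu": "L", "Ile": "I", "Pro": "P", "Met": "M",
}


def to_one_letter(residues):
    """Convert residues listed N-terminus first (left) to C-terminus
    (right) into a one-letter string in the same orientation."""
    return "".join(THREE_TO_ONE[r] for r in residues)
```

A call such as `to_one_letter(["Met", "Gly", "Ala"])` keeps the left-to-right, N-terminus-to-C-terminus orientation defined above.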

In the HIV antigen of the present invention comprising the above-defined Gag-Env fusion protein, it is preferred that the at least one epitope contained in the part of the Env peptide be a contiguous sequence of at least five amino acids of the amino acid sequence represented by Env (512-699).

In the present invention, the Gag peptide of the Gag-Env fusion protein comprises a contiguous sequence of at least ten amino acids of the amino acid sequence represented by Gag (308-437). The Gag peptide comprises a contiguous sequence of preferably at least 30 amino acids, more preferably at least 50 amino acids, still more preferably at least 70 amino acids of the amino acid sequence represented by Gag (308-437). Most preferably, the Gag peptide comprises an amino acid sequence represented by Gag (308-406) or Gag (308-437). In this connection, it should be noted that when the Gag peptide of the Gag-Env fusion protein contains an amino acid sequence positioned on the N-terminal side from the 307th amino acid of the entire amino acid sequence of the Gag protein and/or an amino acid sequence positioned on the C-terminal side from the 438th amino acid of the entire amino acid sequence of the Gag protein, the production yield of the Gag-Env fusion protein becomes disadvantageously lowered.

From the viewpoint of attaining improved antigenicity and productivity of the Gag-Env fusion protein, it is required that the Env peptide comprise at least a part of the amino acid sequence represented by Env (512-699), which part contains at least one epitope which is reactive to an HIV antibody. The epitope of the Env peptide comprises a contiguous sequence of preferably at least 5 amino acids, more preferably at least 10 amino acids, most preferably at least 15 amino acids of the amino acid sequence represented by Env (512-699).
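The residue-range conditions stated above can be checked mechanically. The sketch below is illustrative only and assumes the 1-based, inclusive numbering of FIGS. 1A to 2C; the type `Region` and the helper names are hypothetical and do not appear in the specification. It tests only the positional conditions; whether a part actually contains a reactive epitope must be established experimentally.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """Residue range in 1-based, inclusive numbering, e.g. Gag (308-437)."""
    start: int
    end: int


GAG_REGION = Region(308, 437)  # Gag (308-437) as defined in the text
ENV_REGION = Region(512, 699)  # Env (512-699) as defined in the text


def length(r: Region) -> int:
    """Number of residues in an inclusive range."""
    return r.end - r.start + 1


def overlap(a: Region, b: Region) -> int:
    """Number of contiguous residues shared by two ranges (0 if disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def meets_minimum(peptide: Region, region: Region, min_len: int) -> bool:
    """True if the peptide covers at least min_len contiguous residues
    of the reference region."""
    return overlap(peptide, region) >= min_len


def extends_outside(peptide: Region, region: Region) -> bool:
    """True if the peptide carries residues N-terminal or C-terminal
    of the reference region."""
    return peptide.start < region.start or peptide.end > region.end
```

For example, Gag (308-406) spans 99 residues, satisfies the ten-residue minimum via `meets_minimum(Region(308, 406), GAG_REGION, 10)`, and does not extend outside Gag (308-437).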

The Gag-Env fusion protein of the HIV antigen of the present invention can be produced, using genetic engineering techniques, by a method which comprises ligating an env gene coding for the above-mentioned specific Env peptide containing at least one epitope of an HIV antigen to a gag gene coding for the above-mentioned specific Gag peptide downstream of the gag gene, to thereby obtain a recombinant DNA molecule comprising a gag-env fusion gene, and expressing the gag-env fusion gene. According to the present invention, by the expression of the above-mentioned specific gag-env fusion gene, a novel HIV antigen, which has excellent antigenicity and therefore is effective for detecting HIV antibodies with extremely high accuracy, has for the first time been produced. The HIV antigen of the present invention is also advantageous in that the antigen can be provided in a yield higher than has conventionally been attainable.

The Gag-Env fusion protein of the HIV antigen of the present invention reacts with all of the sera from HIV carriers tested and, therefore, the Gag-Env fusion protein of the HIV antigen of the present invention is extremely useful not only as an antigen for producing a diagnostic reagent but also as an active ingredient for an HIV vaccine.

As mentioned above, the usefulness of various types of partial peptides of the Gag protein has been reported (see, for example, Japanese Patent Application Laid-Open Specification No. 4-117289). However, the usefulness of the entire Gag protein (p55), i.e., the entire amino acid sequence thereof, has not yet been reported. The reason why the usefulness of Gag protein p55 has not yet been reported resides in the fact that p55 is an extremely unstable immature protein which is formed in the course of the formation of HIV particles, and usually immediately undergoes processing to differentiate into mature proteins p17, p24, and p15. Conventionally, it has been totally inconceivable to use such an immature protein as an HIV antigen. As shown in step 2 of Example 2 described later, the present inventors have produced p55, p17, p24, and p15 by recombinant DNA techniques, and made comparisons between p55, p17, p24, and p15 with respect to their reactivities with antibodies in sera derived from HIV carriers. As a result, it has surprisingly been found that, among these proteins, p55 has the highest reactivity with HIV antibodies, and that the minimum quantity of p55 necessary for detecting antibodies is smaller than those of the above-mentioned mature Gag proteins. That is, Gag protein p55, even alone, can be used as an effective antigen for diagnosis of AIDS. Furthermore, when Gag protein p55 is used in the form of a mixture with the above-mentioned Gag-Env fusion protein, the reliability of reactivity with HIV antibodies is enhanced.

Accordingly, in another aspect of the present invention, there is provided an HIV antigen, which comprises a mixture of the HIV antigen comprising the above-mentioned Gag-Env fusion protein and a Gag protein in substantially isolated form comprising the amino acid sequence represented by Gag (1-500), which is the entire amino acid sequence of the Gag protein shown in FIGS. 1A and 1B, wherein each of the numbers indicated in the parentheses is the positional amino acid number in FIGS. 1A and 1B.

Further, in still another aspect of the present invention, there is provided a substantially pure HIV antigen, comprising a Gag protein in substantially isolated form comprising the amino acid sequence represented by Gag (1-500), which is the entire amino acid sequence of the Gag protein shown in FIGS. 1A and 1B, wherein each of the numbers indicated in the parentheses is the positional amino acid number in FIGS. 1A and 1B.

The present invention is described below in more detail.

Essentially, the HIV antigens of the present invention can be prepared in accordance with the following schemes I to III.

Scheme I. Determination of a region of an Env protein having reactivity with HIV antibody:

Various env gene fragments are individually fused to a highly expressing gene, such as the lacZ gene, downstream thereof so that various regions of the env gene are individually expressed in a chimeric form with, e.g., the lacZ gene, thereby producing Env peptides as fusion proteins each comprised of a respective Env peptide and a LacZ protein (β-galactosidase). Then, these expression products are subjected to an immunological reaction with a large number of sera from AIDS patients, asymptomatic HIV carriers (AC), and AIDS-related complex (ARC) patients, thereby identifying the epitope region of the Env protein, which epitope region is defined as a region exhibiting a strong, specific reactivity with the sera.

Scheme II. Production of a Gag-Env fusion protein in large quantity and confirmation of reactivity thereof with HIV antibodies:

With respect to various env gene fragments coding for the partial Env peptides containing epitope regions identified in scheme I, above, and to various gag gene fragments coding for Gag peptides, the following procedure is performed in order to identify a Gag-Env fusion protein which is desirable from the viewpoint of improving both productivity (yield) and antigenicity. Illustratively stated, with respect to various combinations of env gene fragments and gag gene fragments, an env gene fragment is fused to a gag gene downstream thereof, and, under the control of a promoter, the env gene is expressed in a chimeric form with the gag gene, thereby producing an Env peptide as a Gag-Env fusion protein.

With respect to the Gag-Env fusion protein which is produced in the largest quantity, the reactivity of the Gag-Env fusion protein with HIV antibodies is determined.
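The screening logic of scheme II, i.e., forming each pairing of a gag fragment with a downstream env fragment and then selecting the construct produced in the largest quantity, can be sketched as follows. This is an illustrative sketch only; the fragment labels are taken from regions named elsewhere in this specification, the function names are hypothetical, and the yield values must be supplied from actual measurements (none are assumed here).

```python
from itertools import product

# Fragment labels used for illustration; any set of candidate
# fragments could be substituted.
GAG_FRAGMENTS = ["Gag(308-406)", "Gag(308-437)"]
ENV_FRAGMENTS = ["Env(512-611)", "Env(512-699)"]


def candidate_fusions(gag_fragments, env_fragments):
    """Every Gag-Env pairing, Gag listed first because the env fragment
    is fused downstream of the gag fragment."""
    return list(product(gag_fragments, env_fragments))


def highest_yield(measured_yields):
    """Return the (gag, env) pair with the largest measured yield.

    measured_yields maps (gag, env) pairs to quantities measured by
    the experimenter, in any consistent unit.
    """
    return max(measured_yields, key=measured_yields.get)
```

The construct returned by `highest_yield` is the one whose reactivity with HIV antibodies would then be determined.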

Scheme III. Confirmation of reactivity of Gag protein p55 having the entire amino acid sequence of a Gag protein:

In the same manner as in scheme II above, Gag protein p55 and Gag peptides p17, p24, and p15 are produced and individually subjected to an immunological reaction with sera from HIV carriers, and the reactivities of p55, p17, p24, and p15 with the sera are compared with one another, thereby confirming that p55 is the most active with respect to both reactivity and HIV detection ratio.

The techniques required for practicing the above-mentioned schemes I to III will now be described.

(1) Preparation of cDNA fragments containing the gag gene and/or env gene of HIV:

The entire region or a fragment of each of the gag gene and the env gene can be used. The entire region or fragment is inserted into a vector for high expression so that the reading frame of the insert matches with that of the vector. Since the HIV genome consists of RNA, when gene expression using recombinant DNA techniques is conducted, it is necessary that the above-mentioned HIV genes be converted to cDNA fragments complementary thereto. The cDNA fragments can be prepared from the provirus genome which is integrated into a chromosome of a host cell or from a cloned extrachromosomal circular DNA. Alternatively, the cDNA fragments can be screened from a cDNA library which has been constructed by a conventional method using reverse transcriptase, and using as a template the RNA genome extracted from virus particles of HIV. However, in the above-mentioned methods for the preparation of cDNA fragments, it is necessary to directly handle highly dangerous HIV. From the viewpoint of biohazard prevention, it is not preferable to directly handle HIV. Accordingly, in order to not only avoid the biohazard problems caused by HIV, but also to save labor in the preparation of cDNA fragments, it is recommended that known and established cDNA clones of HIV gene be used. With respect to operations for studying HIV genes, such as gene cloning, preparation of a restriction map and determination of nucleotide sequences, many reports have been issued by researchers around the world. For assuring safety and efficiency, it is desirable to utilize the results of these published studies. For example, use can be made of plasmids, such as pNL3-1, pNL3-2 and pNL4-3, all of which are genomic clones of HIV-1 provirus and available from the National Institutes of Health, U.S.A. (with respect to pNL3-1 and pNL3-2, see Journal of Virology, 59, 284-291, 1986; and with respect to pNL4-3, see GenBank data file HIVNL43). Further, using plasmid pNL4-3, microorganisms containing various plasmids carrying a partial region of the HIV-1 genome have been prepared and deposited. Illustratively stated, these deposited microorganisms are E. coli JM109/pCV91, containing plasmid pCV91 having a central region of a gag-pol fusion gene (deposited at the Fermentation Research Institute, Japan under accession number FERM BP-3195), E. coli JM109/pNLH122, containing plasmid pNLH122 having the 5' half of a gag gene (deposited at the Fermentation Research Institute, Japan under accession number FERM BP-3196), E. coli JM109/pTG581, containing plasmid pTG581 having the entire region of the gag gene (deposited at the Fermentation Research Institute, Japan under accession number FERM BP-3927), E. coli JM109/pNS210, containing plasmid pNS210 having the entire region of the env gene (deposited at the Fermentation Research Institute, Japan under accession number FERM BP-3920), E. coli JM109/pTE192, containing plasmid pTE192 having a cDNA coding for the Env (512-611) region of the Env protein (deposited at the Fermentation Research Institute, Japan under accession number FERM BP-3925), and E. coli JM109/pGE33, containing plasmid pGE33 having a cDNA coding for a Gag-Env fusion protein consisting of Gag (308-406) and Env (512-611) (deposited at the Fermentation Research Institute, Japan under accession number FERM BP-3923). Preparation of cDNA fragments from these clones can be performed according to conventional methods. For example, a desired DNA fragment is cleaved out from the above-mentioned clones by means of restriction enzymes and purified by the technique of phenol extraction, chloroform treatment, ethanol precipitation, or the like. The restriction enzymes to be used for cleaving DNA can be appropriately chosen, based on the restriction maps of the individual clones.

(2) Construction of plasmids for expression of the gag gene, env gene, and gag-env fusion gene, and preparation of transformants having such plasmids inserted therein:

The HIV gene cDNA fragment prepared according to the above procedure is fused to a highly expressing gene on a plasmid or a vector for directly expressing a cloned gene according to conventional methods, e.g., by the use of T4 DNA ligase, to thereby construct an HIV gene expression plasmid. In the present invention, the term "plasmid" is employed as a convenient indication, and in substance, broadly means a replicon which expresses the HIV gene.

Accordingly, for constructing such expression plasmids, conventional and commercially available vectors for expression can be employed. Examples of suitable vectors include the plasmid vector pSN508 series of the enterobacteria family (see U.S. Pat. No. 4,703,005), plasmid vector pJM105 from yeast (see Japanese Patent Application Laid-Open Specification No. 62-286930), plasmid vector pBH103 series from yeast (see Japanese Patent Application Laid-Open Specification No. 63-22098), attenuated varicella virus vector (see Japanese Patent Application Laid-Open Specification No. 53-41202), attenuated Marek's disease virus vector (see European Patent Application Publication No. 334530), plasmid vectors from Escherichia coli, such as the pUR290 series, including pUR290, 291, and 292 (see EMBO Journal, 2, 1791-1794, 1983), pSN5182 (see Journal of Bacteriology, 157, 909-917, 1984), and the pT7 series (see Proceedings of the National Academy of Sciences USA 82, 1074-1078, 1985).

In constructing an expression vector, it is important to insert and ligate the above gene under the control of a strong promoter and to fuse the gene with a gene ensuring expression in large quantity. For example, when the above pUR290 series vector is used, it is preferred that the above gene be fused thereto downstream of the lacZ gene. When pSN5182 is used, the gene is preferably fused downstream of the pstS gene. When pT7-7 of the above pT7 series is used, the gene is preferably cloned into a multicloning site downstream of the T7 promoter.

pT7-7 is especially suitable for the expression of the gag gene, env gene, and gag-env fusion gene of HIV, and the T7 promoter thereof is an extremely strong promoter. Therefore, it is especially preferred to use pT7-7 in the present invention.

In the insertion and ligation of the gene, it is requisite that the plasmid cleaved with a restriction enzyme be pretreated by BAP (bacterial alkaline phosphatase) to remove a phosphate group, thereby preventing self-ligation thereof, and that the reading frame of the gene on the plasmid and that of the inserted gene be arranged to match with each other in order to ensure efficient translation. That is, the expression of the HIV gene in large quantity is guaranteed by inserting the HIV gene into a highly expressing gene in a manner such that the reading frame of the HIV gene matches with that of the highly expressing gene. The above-mentioned matching of reading frames can be attained by conventional methods using enzymes, such as restriction enzymes, nuclease Bal31 and mung bean nuclease.
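The reading-frame requirement described above reduces to simple arithmetic on nucleotide counts. The following sketch is illustrative only; the function names are hypothetical, nucleotide positions are counted from the first base of the open reading frame concerned (not from any plasmid coordinate), and the sketch says nothing about which enzyme treatment would be used to adjust a junction.

```python
from typing import Tuple


def codon_span(first_residue: int, last_residue: int) -> Tuple[int, int]:
    """1-based nucleotide span, within an open reading frame, encoding
    residues first_residue..last_residue (1-based, inclusive)."""
    return 3 * (first_residue - 1) + 1, 3 * last_residue


def frame_offset(upstream_coding_nt: int, insert_leading_nt: int) -> int:
    """Offset (0, 1 or 2) of the insert's first complete codon relative
    to the upstream reading frame.

    upstream_coding_nt: coding nucleotides of the highly expressing gene
        that precede the junction.
    insert_leading_nt: nucleotides in the insert that precede its first
        complete codon.
    """
    return (upstream_coding_nt + insert_leading_nt) % 3


def in_frame(upstream_coding_nt: int, insert_leading_nt: int) -> bool:
    """True when the insert is read in the same frame as the upstream gene."""
    return frame_offset(upstream_coding_nt, insert_leading_nt) == 0
```

For example, `codon_span(308, 406)` gives nucleotides 922 to 1218 of the reading frame, a span of 297 nucleotides encoding 99 residues. A non-zero `frame_offset` indicates that nucleotides must be removed or added at the junction before the fusion can be translated in frame.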

A suitable host cell, into which the above constructed expression vector is to be introduced in order to obtain a transformant, should be selected from sensitive host cells which permit replication and expression of the genes on the expression vector and, especially, from cells which allow the constructed expression vector to be easily introduced thereinto and to be easily detected. For example, when the above-mentioned pSN series vector is used as an expression vector, Escherichia coli strain C75 (deposited at the Fermentation Research Institute, Japan under accession number 10191) is preferably employed as the host bacterium, because the transformant obtained by the insertion of the above vector can be screened using its drug resistance as a marker. When pUR290 series and pT7 series vectors are employed, use is made of Escherichia coli strain UT481 (see Journal of Bacteriology, 163(1), 376-384, 1985), Escherichia coli strain BL21 (DE3) (see Journal of Molecular Biology, 189(1), 113-130, 1986), Escherichia coli strain JM109 (DE3) (see Journal of Molecular Biology, 189(1), 113-130, 1986; and Gene, 33(1), 103-119, 1985), and Escherichia coli strain JM103 (see Nucleic Acids Research, 9, 309-321, 1981). These are preferably used because the transformant obtained by the introduction of the vectors can be screened using ampicillin resistance as a marker.

The introduction of an expression vector into such host cells as mentioned above can be carried out by conventional methods, such as the method using potassium chloride (see Journal of Molecular Biology, 53, 154-162, 1970). The transformants into which an expression plasmid carrying the gag gene, env gene, or gag-env fusion gene has been introduced are screened from colonies which are positive for the above-mentioned marker. Subsequently, the expression vector DNA is extracted from the screened transformant colonies, digested with a restriction enzyme and then subjected to agarose gel electrophoresis to determine the size of the inserted DNA fragment. The colony in which the presence of the DNA fragment of the gene has been confirmed is employed as a transformant clone for the expression of the HIV gene.

(3) Production of a LacZ (β-galactosidase)-Env fusion protein in large quantity:

According to the procedure shown in item (2) above, the expression of a LacZ-Env fusion protein in large quantity can be conducted. For example, the large-quantity expression of fusion proteins can be performed by cloning env gene fragments (shown in Table 1), which code for a variety of Env peptides shown in item (5) below, into pUR290, pUR291, or pUR292. With respect to the LacZ-Env fusion protein, it is possible that the β-galactosidase (LacZ) may react with sera from some asymptomatic HIV carriers and non-infected humans, thereby exhibiting false positivity. However, for example, if test sera are pretreated so as to preadsorb anti-LacZ antibodies in the sera with LacZ protein or the LacZ moiety of the fusion protein is masked with anti-LacZ antibodies, according to conventional methods, the above-mentioned false positive reaction can be suppressed, thereby allowing the use of the LacZ-Env fusion protein as an antigen for diagnosis of AIDS.

(4) Confirmation of the expression of the gag gene, env gene, and gag-env fusion gene in transformant clones:

The confirmation of gene expression by transformant clones obtained in item (2) above can be carried out by analyzing a crude extract of transformant clones by a conventional method, such as polyacrylamide gel electrophoresis (PAGE) and Western blotting. The crude extract can be prepared by a method in which, after culturing transformants in a conventional medium, the bacterial cells are collected by low-speed centrifugation and then are treated with sodium dodecyl sulfate (SDS) and 2-mercaptoethanol, followed by high-speed centrifugation to thereby collect the supernatant. The supernatant is subjected to SDS-PAGE to thereby fractionate it into protein bands. The fractionated bands are stained with CBB (Coomassie Brilliant Blue) to thereby confirm whether or not large-quantity expression has been attained. When the Western blotting method is employed, the confirmation of the large-quantity expression can be made by the following procedure according to conventional methods using materials selected from a wide variety of commercially available materials: The above-mentioned crude extract is subjected to SDS-PAGE. The resultant fractionated protein bands are transferred onto a nitrocellulose membrane or a polyvinylidene difluoride membrane by the use of a transblotting cell. The membrane is immersed in a gelatin solution or a skim milk solution, thereby blocking the membrane. Thereafter, for example, when the samples on the membrane to be examined are gene expression products of HIV, they are subjected to a primary reaction with serum from asymptomatic HIV carriers. Then, after rinsing the samples, they are further subjected to a secondary reaction with a peroxidase-conjugated anti-human IgG antibody. Then, after rinsing the samples, they are subjected to coloring, using a hydrogen peroxide solution and a coloring agent, to detect bands which specifically react with sera from HIV carriers, thereby confirming the expression of the gag gene, env gene, and gag-env fusion gene of HIV in the above-mentioned clones.

(5) Determination of a partial region of an Env peptide containing epitopes reactive with HIV antibodies:

The determination can be achieved, for example, utilizing the reactivity of the LacZ-Env fusion protein described in item (3) above, by the Western blotting method described in item (4) above. According to this method, it has been found that, with respect to the Env protein of the entire amino acid sequence shown in FIGS. 2A, 2B, and 2C, partial regions which are reactive with HIV antibodies include those having the following amino acid sequences:

Env(14-244), Env(14-437),

Env(14-611), Env(175-363),

Env(224-510), Env(244-611),

Env(244-434), Env(244-437),

Env(244-772), Env(244-826),

Env(437-510), Env(437-611),

Env(437-722), Env(437-826),

Env(512-611), Env(512-699),

Env(610-722), Env(610-826), and

Env(721-826).

Among the above-mentioned amino acid sequences, the following amino acid sequences, which exhibit especially strong reactivity with HIV antibodies, are identified as containing epitopes of the Env protein:

Env(14-244), Env(244-434),

Env(244-510), Env(512-611),

Env(512-699), Env(610-722), and

Env(721-826).

In the present invention, as mentioned above, among the epitope regions of these 7 amino acid sequences, at least a part of the amino acid sequence represented by Env(512-699), which part contains at least one epitope reactive with an HIV antibody, is employed from the viewpoint of attaining excellent antigenicity and productivity of a Gag-Env fusion protein as an HIV antigen of the present invention. The epitope contained in the part of the Env peptide is preferably a contiguous sequence of at least 5 amino acids of the amino acid sequence represented by Env(512-699).

(6) Determination of a Gag peptide which is preferred for the construction of a Gag-Env fusion protein:

In a Gag-Env fusion protein, a Gag peptide is used, instead of LacZ, as a carrier effective for producing an Env peptide in high yield. With respect to the Gag peptide, therefore, productivity is more important than antigenicity. Accordingly, after constructing an expression vector for a gag gene according to, for example, the procedure described in item (2) above, the productivity of a Gag peptide is determined by SDS-PAGE in the same manner as in item (4) above.

It has been found that Gag peptides, which can be successfully used for producing a Gag-Env fusion protein in high yield, are represented by the following amino acid sequences:

Gag(1-119), Gag(1-132), Gag(1-154),

Gag(1-210), Gag(1-309), Gag(1-405),

Gag(1-406), Gag(1-437), Gag(1-500),

Gag(121-405), Gag(121-406),

Gag(121-437), Gag(308-405),

Gag(308-406), Gag(308-435),

Gag(308-436), Gag(308-437), and

Gag(308-500).

Of these, Gag(308-437) is one of the amino acid sequences which are most highly accumulated in E. coli cells, and hence is useful for assuring the high yield of a Gag-Env fusion protein in the present invention. As mentioned above, in the present invention, the Gag peptide of the Gag-Env fusion protein comprises a contiguous sequence of at least 10 amino acids, preferably at least 30 amino acids, more preferably at least 50 amino acids, still more preferably at least 70 amino acids of the amino acid sequence represented by Gag(308-437). As a Gag peptide of the Gag-Env fusion protein, most preferably employed is a peptide having an amino acid sequence represented by Gag(308-406) or Gag(308-437).
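The length thresholds stated above can be illustrated with simple range arithmetic. The following is a minimal sketch, not part of the disclosed method: it treats each Gag(start-end) designation as an inclusive residue range and counts how many contiguous residues the range shares with Gag(308-437). The function names are hypothetical, and the sketch compares positions only, not actual amino acid sequences.

```python
# Inclusive residue ranges, following the Gag(start-end) notation of the text.
REFERENCE = (308, 437)  # Gag(308-437)

GAG_PEPTIDES = [
    (1, 119), (1, 132), (1, 154), (1, 210), (1, 309), (1, 405),
    (1, 406), (1, 437), (1, 500), (121, 405), (121, 406), (121, 437),
    (308, 405), (308, 406), (308, 435), (308, 436), (308, 437), (308, 500),
]


def shared_residues(peptide, reference=REFERENCE):
    """Number of contiguous positions common to two inclusive ranges."""
    start = max(peptide[0], reference[0])
    end = min(peptide[1], reference[1])
    return max(0, end - start + 1)


def meets_minimum(peptide, minimum=10):
    """True when the peptide shares at least `minimum` positions with the reference."""
    return shared_residues(peptide) >= minimum


for first, last in GAG_PEPTIDES:
    n = shared_residues((first, last))
    print(f"Gag({first}-{last}): {n} positions within Gag(308-437)")
```

Under this positional reading, Gag(308-406) shares 99 positions with Gag(308-437) and so satisfies every threshold up to 70 amino acids, whereas Gag(1-309) shares only 2.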

(7) Production of a Gag-Env fusion protein:

A preferred Gag-Env fusion protein is composed of a Gag peptide selected in item (6) above and an Env peptide selected in item (5) above. A plasmid expressing such a fusion protein can be constructed, for example, by inserting a gene coding for an Env peptide described in item (5) above into a plasmid expressing a Gag peptide alone, which Gag peptide is described in item (6) above, or a gag gene-containing plasmid prepared from, e.g., a gag-pol fusion gene, according to the procedure described in items (1) and (2) above. The fusion of a gag gene and an env gene is performed in a fashion such that the reading frame of the inserted env gene matches that of the gag gene on the expression plasmid. This can be achieved by conventional methods employing enzymes, such as restriction enzymes, nuclease Bal31 and mung bean nuclease. In this operation, a Gag peptide and an Env peptide may be fused together through a junction consisting of several amino acid residues. It is well known in the art that such a junction is generally incorporated into fusion proteins as a result of the above operation. It does not adversely affect the antigenicity of the Gag-Env fusion protein of the present invention. Accordingly, the expression "Gag-Env fusion protein which consists of a Gag peptide and an Env peptide" and expressions similar thereto should be interpreted to include a Gag-Env fusion protein which consists of a Gag peptide, an Env peptide and a junction, if any, present therebetween. The confirmation of large-quantity expression can be carried out in the same manner as described in item (4) above.
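The reading-frame requirement described above amounts to a modulo-3 check on the number of nucleotides lying between the gag start codon and the first env codon. The sketch below is only an illustration under that assumption; the function names and the example junction lengths are hypothetical and are not taken from the plasmids of the Examples.

```python
def in_frame(upstream_coding_nt: int, junction_nt: int = 0) -> bool:
    """True when the env insert begins on a codon boundary of the gag frame."""
    return (upstream_coding_nt + junction_nt) % 3 == 0


def nucleotides_to_add(upstream_coding_nt: int, junction_nt: int = 0) -> int:
    """Smallest number of nucleotides that would restore the gag frame."""
    return (-(upstream_coding_nt + junction_nt)) % 3


# Gag(308-406) spans 99 codons, i.e. 297 nucleotides of coding sequence.
gag_nt = 99 * 3

# Hypothetical junctions: 6 nt (two residues) keeps the frame, 7 nt does not.
print(in_frame(gag_nt, 6), nucleotides_to_add(gag_nt, 6))
print(in_frame(gag_nt, 7), nucleotides_to_add(gag_nt, 7))
```

A junction whose length is a multiple of three adds whole residues between the two peptides, which corresponds to the "junction consisting of several amino acid residues" mentioned in the text.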

(8) Production of a Gag protein or a Gag-Env fusion protein by culturing a transformant which has been confirmed with respect to the expression of the gag gene or gag-env fusion gene:

For example, the following steps can be taken. For preparing a transformant seed to be cultured for the large-quantity production of a protein, when the transformant is, for example, E. coli, it is cultured at 30° to 40° C. for 12 to 35 hours in LB medium until the cell density of E. coli reaches 2×10⁹ to 8×10⁹ cells/ml. Subsequently, 1 to 10 liters of the seed are inoculated into 1000 liters of fresh LB medium, followed by two-stage culturing consisting of preculturing and postculturing. The purpose of the preculturing is to proliferate seed cells and replicate the expression vector, and the preculturing is carried out at 10° to 40° C. for 1 to 24 hours, preferably 15° to 37° C. for 2 to 12 hours. For example, in the case of E. coli, the preculturing is discontinued when the cell density of E. coli has reached an OD_(600nm) of 0.1 to 2.0. After the termination of the preculturing, the resultant culture is subjected to postculturing. The postculturing is to be performed under strictly controlled conditions under which the transcription and translation of a gene cloned into an expression vector are ensured and, simultaneously, random decomposition and inactivation of gene products produced by translation, by proteolytic enzymes present in host cells, can be avoided. The postculturing is preferably carried out at a temperature lower than that of the preculturing. The postculturing may be performed at 10° to 40° C. for 1 to 40 hours, preferably at 15° to 37° C. for 3 to 35 hours. Further, taking into consideration the properties of the expression vector used, in order to promote and induce the expression, starvation of phosphate in the culture medium, addition of an inducer such as IPTG (isopropyl β-D-thiogalactopyranoside) to the culture, and the like, can be conducted at the beginning of the postculturing. By carrying out the above two-stage culturing, the Gag protein or the Gag-Env fusion protein is generally produced in a yield of about 1 to 50 mg per liter of the culture.

Among various combinations of Gag peptides and Env peptides as mentioned hereinbefore, the fusion proteins which can be produced in high yield and exhibit especially high antigenicity are Gag-Env fusion proteins respectively having amino acid sequences represented by: Gag(308-406)-Env(512-611), Gag(308-437)-Env(512-611), and Gag(308-406)-Env(512-699).
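The scale figures given for the two-stage culturing can be put into perspective with a short calculation. The following sketch simply restates the stated ranges (1 to 10 liters of seed, 1000 liters of medium, about 1 to 50 mg of protein per liter); the function names are illustrative, and the totals ignore the small volume added by the seed itself.

```python
MAIN_CULTURE_L = 1000        # fresh LB medium, liters
SEED_RANGE_L = (1, 10)       # seed culture volume, liters
YIELD_RANGE_MG_PER_L = (1, 50)


def inoculum_percent(seed_l, culture_l=MAIN_CULTURE_L):
    """Seed volume as a percentage of the main culture volume."""
    return 100 * seed_l / culture_l


def total_yield_g(yield_mg_per_l, culture_l=MAIN_CULTURE_L):
    """Total protein per run in grams, for a given yield in mg per liter."""
    return yield_mg_per_l * culture_l / 1000


low_pct, high_pct = (inoculum_percent(v) for v in SEED_RANGE_L)
low_g, high_g = (total_yield_g(y) for y in YIELD_RANGE_MG_PER_L)

print(f"inoculum: {low_pct:.1f}% to {high_pct:.1f}% (v/v)")
print(f"protein per 1000-liter run: {low_g:.0f} g to {high_g:.0f} g")
```

On these figures the inoculum is 0.1% to 1% of the culture volume, and a single 1000-liter run would correspond to roughly 1 to 50 g of Gag or Gag-Env fusion protein before purification losses.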

(9) Purification of the Gag protein and Gag-Env fusion protein which have been produced in high yield:

This can be achieved by employing conventional methods in combination. For example, purification of proteins can be carried out by an appropriate combination of the following methods: (a) collection of transformed cells by the use of a precipitant, centrifugation, filtration, etc.; (b) preparation of a crude extract by disrupting transformed cells by the use of ultrasonic treatment, pressure/vacuum treatment, a homogenizer, etc.; (c) purification by adsorption and desorption with silicic acid or an activated carbon, salting out, precipitation from an organic solvent, etc., as well as high degree of purification by fractionation employing ultracentrifugation, column chromatography, electrophoresis, etc.; and (d) purification by adsorption and desorption with silicic acid or activated carbon and fractionation by density gradient centrifugation (see Japanese Patent Application Laid-Open Specification No. 63-297).

Accordingly, in still another aspect of the present invention, there is provided a method for producing a substantially pure HIV antigen comprising a Gag-Env fusion protein, wherein the Gag-Env fusion protein consists of a Gag peptide fused at its C-terminus to an Env peptide, which comprises:

(a) ligating a first deoxyribonucleic acid sequence to a replicable expression vector to obtain a first recombinant DNA molecule capable of replication in a host cell and comprising said expression vector and said first deoxyribonucleic acid sequence inserted therein, the first deoxyribonucleic acid sequence coding for a Gag peptide comprising a contiguous sequence of at least ten amino acids of the amino acid sequence represented by Gag (308-437), wherein each of the numbers indicated in the parentheses is the positional amino acid number in the entire amino acid sequence of the Gag protein shown in FIGS. 1-(1) through 1-(2);

(b) ligating a second deoxyribonucleic acid sequence to the first recombinant DNA molecule downstream of the first deoxyribonucleic acid sequence, so that the second sequence is fused to the first sequence, the second deoxyribonucleic acid sequence coding for an Env peptide comprising at least a part of the amino acid sequence represented by Env (512-699), the part containing at least one epitope which is reactive with an HIV antibody, wherein each of the numbers indicated in the parentheses is the positional amino acid number in the entire amino acid sequence of the Env protein shown in FIGS. 2A, 2B, and 2C,

thereby obtaining a second recombinant DNA molecule capable of replication in a host cell and comprising the expression vector, the first deoxyribonucleic acid sequence, and the second deoxyribonucleic acid sequence fused downstream of the first sequence;

(c) transforming prokaryotic or eukaryotic cells with the second recombinant DNA molecule to produce transformants;

(d) selecting the transformants from untransformed prokaryotic or eukaryotic cells;

(e) culturing the transformants to produce an HIV antigen comprising a Gag-Env fusion protein, wherein the Gag-Env fusion protein consists of the Gag peptide fused at its C-terminus to the Env peptide; and

(f) isolating the HIV antigen comprising the Gag-Env fusion protein from the cultured transformants.

It is preferred that the second deoxyribonucleic acid code for a contiguous sequence of at least five amino acids of the amino acid sequence represented by Env (512-699).

The above-mentioned first deoxyribonucleic acid codes for a Gag peptide comprising a contiguous sequence of at least ten amino acids, preferably at least 30 amino acids, more preferably at least 50 amino acids, still more preferably at least 70 amino acids of the amino acid sequence represented by Gag (308-437). Most preferably, the first deoxyribonucleic acid codes for a Gag peptide comprising the amino acid sequence represented by Gag (308-406) or Gag (308-437).

From the viewpoint of attaining improved antigenicity and productivity of the Gag-Env fusion protein, it is required that the second deoxyribonucleic acid sequence code for an Env peptide comprising at least a part of the amino acid sequence represented by Env (512-699), which part contains at least one epitope which is reactive to an HIV antibody. The second deoxyribonucleic acid codes for a contiguous sequence of preferably at least five amino acids, more preferably at least ten amino acids, most preferably at least 15 amino acids of the amino acid sequence represented by Env (512-699).

It is preferred that the expression vector be derived from the pT7 series or the pUR290 series vectors. Most preferably, the expression vector is pT7-7.

The expression vector obtained in the method of the present invention may be provided in a form contained in a sealed small vessel, such as an ampule or a vial, or in a form incorporated into a host cell.

In a further aspect of the present invention, there is provided a recombinant DNA molecule capable of replication in a host cell, comprising a replicable expression vector having inserted therein a first deoxyribonucleic acid sequence, and a second deoxyribonucleic acid sequence fused downstream of the first sequence,

the first deoxyribonucleic acid sequence coding for a Gag peptide comprising a contiguous sequence of at least ten amino acids of the amino acid sequence represented by Gag (308-437), wherein each of the numbers indicated in the parentheses is the positional amino acid number in the entire amino acid sequence of the Gag protein shown in FIGS. 1A and 1B, and

the second deoxyribonucleic acid sequence coding for an Env peptide comprising at least a part of the amino acid sequence represented by Env (512-699), the part containing at least one epitope which is reactive to an HIV antibody, wherein each of the numbers indicated in the parentheses is the positional amino acid number in the entire amino acid sequence of the Env protein of HIV shown in FIGS. 2A, 2B, and 2C.

It is preferred that the second deoxyribonucleic acid sequence code for a contiguous sequence of at least five amino acids of the amino acid sequence represented by Env (512-699).

As more preferred forms of the first deoxyribonucleic acid sequence and second deoxyribonucleic acid sequence of the recombinant DNA molecule of the present invention, those which are described above in connection with the method for producing the Gag-Env fusion protein of the HIV antigen of the present invention can be used.

It is preferred that the expression vector be selected from those which are derived from the pT7 series or the pUR290 series vectors. Most preferably, the expression vector is pT7-7.

The Gag protein and Gag-Env fusion protein produced in large quantity by using the recombinant DNA molecule of the present invention may be charged and sealed in a small vessel, such as an ampule or a vial, in the form of a liquid or dried powder, or in a form adsorbed on a filter or membrane. When the antigen of the present invention is in a liquid form, a predetermined volume can be taken out and used. When the antigen is in a dried form, the antigen is dissolved in distilled water for reconstitution thereof so that the volume becomes the original volume before being subjected to drying and then, a predetermined volume can be taken and used. When the antigen is in an adsorbed form on a filter or membrane, the antigen is hydrated with an appropriate solution, and used.

In still a further aspect of the present invention, there is provided a reagent for diagnosis of acquired immune deficiency syndrome by an immunological reaction, comprising an immunological reaction effective amount of the HIV antigen of the present invention comprising a Gag-Env fusion protein and/or a Gag protein.

In still a further aspect of the present invention, there is provided a vaccine for acquired immune deficiency syndrome, comprising an effective immunogenic amount of the HIV antigen of the present invention comprising a Gag-Env fusion protein and/or a Gag protein and at least one pharmaceutically acceptable adjuvant, diluent, or excipient.

The dose of the vaccine for adults at one administration may generally be about 0.001 to 1000 μg.

The present invention will now be described in more detail with reference to the following Examples, which should not be construed to limit the scope of the present invention.

PREFERRED EMBODIMENT OF THE INVENTION

Example 1

Step 1 (Construction of plasmids capable of expressing LacZ-Env fusion proteins)

HIV-1 provirus DNA clone pNL4-3 (see Journal of Virology, 59, 284-291, 1986; GenBank data file HIVNL43; which clone pNL4-3 is available from the National Institutes of Health, U.S.A.) is digested with EcoRI and XhoI and then subjected to agarose gel electrophoresis to thereby obtain a DNA fragment of 3.1 kb [nucleotide number 5743-8887 according to GenBank data file HIVNL43]. The obtained DNA fragment is cloned into plasmid pHSG398 which has been digested with EcoRI and SalI and treated with BAP, to thereby obtain plasmid pNS210. The obtained plasmid pNS210 is digested with KpnI and then subjected to agarose gel electrophoresis to thereby obtain a DNA fragment of 2.55 kb. The collected DNA fragment is digested with HaeIII and then subjected to agarose gel electrophoresis to thereby obtain a HaeIII DNA fragment of about 570 b [nucleotide number 7834-8400 according to GenBank data file HIVNL43]. The obtained DNA fragment is cloned into plasmid pUC9 which has been digested with HincII and treated with BAP, to thereby obtain plasmid pEH22. The obtained plasmid pEH22 is digested with BamHI and PstI to obtain a DNA fragment of about 580 b, and the obtained DNA fragment is cloned into plasmid pUR292 (see EMBO Journal, 2, 1791-1794, 1983) which has been cleaved with BamHI and PstI, to thereby obtain plasmid pAS182 (see Table 1). The obtained plasmid pAS182 is digested with HindIII and then self-ligated to thereby obtain plasmid pAS192 (see Table 1). The obtained plasmids pAS182 and pAS192 express LacZ-Env (512-699) and LacZ-Env (512-611) fusion proteins, respectively. In addition to the above plasmids, 15 other types of plasmids which express various types of LacZ-Env fusion proteins are constructed (see Table 1).
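The fragment sizes quoted in this step follow directly from the nucleotide numbers given for GenBank data file HIVNL43. The short sketch below checks that arithmetic, assuming the quoted coordinates are inclusive at both ends; the helper name is illustrative.

```python
def fragment_length(first_nt: int, last_nt: int) -> int:
    """Length of a fragment whose end coordinates are both inclusive."""
    return last_nt - first_nt + 1


ecori_xhoi = fragment_length(5743, 8887)   # EcoRI-XhoI fragment of pNL4-3
haeiii = fragment_length(7834, 8400)       # HaeIII fragment cloned into pUC9

print(f"EcoRI-XhoI fragment: {ecori_xhoi} nt (about {ecori_xhoi / 1000:.1f} kb)")
print(f"HaeIII fragment: {haeiii} nt")
```

The results, 3145 and 567 nucleotides, agree with the stated sizes of 3.1 kb and about 570 b.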

Step 2 (Large-quantity production of LacZ-Env fusion proteins)

17 types of expression vectors shown in Table 1 including plasmids pAS182 and pAS192 are individually introduced into E. coli strain JM103 (see Nucleic Acids Research, 9, 309-321, 1981). The resultant E. coli transformants are individually inoculated into 2 ml of LB medium containing 20 μg/ml of ampicillin and incubated at 37° C. overnight with shaking, to obtain cultures. Then, 0.05 to 0.1 ml of each of the cultures is inoculated into 5 ml of LB medium containing 20 μg/ml of ampicillin and then incubated at 37° C. with shaking. When the cell density reaches an OD_(600nm) of 0.5, IPTG is added to the culture to a final concentration of 1 mM to thereby induce expression of fusion proteins. The culture is incubated at 37° C. for 5 hours with shaking and then the E. coli cells are harvested from 1.5 ml of the culture by centrifugation. The harvested cells are suspended in 120 μl of 20 mM Tris-HCl (pH 7.5) to thereby obtain a suspension. To the obtained suspension is added 60 μl of SDS-PAGE sample buffer, and mixed well. The mixture is heated at 100° C. for 3 minutes and centrifuged at 12,000 rpm for 5 minutes, to thereby obtain a supernatant. 7.5 μl of the obtained supernatant is applied to an SDS-PAGE gel to thereby attain fractionation. The gel is stained with CBB to confirm production of fusion proteins. Thus, large-quantity production of 17 types of LacZ-Env fusion proteins is confirmed.
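To show how much culture each gel lane represents, the volumes in this step can be combined as follows. This is a rough sketch only: it assumes that the cell pellet adds negligible volume and that no material is lost at the centrifugation steps, neither of which is stated in the text.

```python
HARVESTED_CULTURE_UL = 1500.0   # 1.5 ml of induced culture
SUSPENSION_UL = 120.0           # 20 mM Tris-HCl (pH 7.5)
SAMPLE_BUFFER_UL = 60.0         # SDS-PAGE sample buffer
LOADED_UL = 7.5                 # volume applied to the gel

# Total lysate volume, ignoring the pellet's own volume (assumption).
lysate_ul = SUSPENSION_UL + SAMPLE_BUFFER_UL

# Fraction of the lysate in one lane and the culture volume it corresponds to.
loaded_fraction = LOADED_UL / lysate_ul
culture_equivalent_ul = HARVESTED_CULTURE_UL * loaded_fraction

# Dilution of the overnight culture into 5 ml of fresh medium.
seed_dilutions = [5.0 / v for v in (0.05, 0.1)]

print(f"lysate volume: {lysate_ul:.0f} ul")
print(f"culture equivalent per lane: {culture_equivalent_ul:.1f} ul")
print(f"seed dilution: 1:{seed_dilutions[0]:.0f} to 1:{seed_dilutions[1]:.0f}")
```

Under these assumptions each lane carries the protein from about 62.5 μl of induced culture, and the overnight culture is diluted roughly 50-fold to 100-fold at inoculation.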

Step 3 (Identification of epitope regions which are recognized by Env antibodies in sera of HIV carriers)

The total proteins of E. coli strain JM103 which has been used in the large-quantity production of 17 types of LacZ-Env fusion proteins (see Table 1) and LacZ protein are fractionated by SDS-PAGE in substantially the same manner as in Step 2, and electroblotted to a polyvinylidene difluoride membrane. The blots are blocked with skim milk (available from Difco Laboratories, U.S.A.), and individually reacted with each of sera A, B, and C, separately, which have been taken from three HIV carriers (asymptomatic HIV carriers) and diluted 100-fold with a buffer containing 20 mM Tris-HCl (pH 7.5), 150 mM NaCl, and 0.05% Tween 20. Before the use of the sera, it has been confirmed that none of the sera reacts with LacZ. As a secondary antibody, use is made of a peroxidase-conjugated goat anti-human IgG (Bio-Rad Laboratories, U.S.A.). By analyzing the results (see Table 2) of the Western blotting, two, three, and five epitope regions recognized by Env antibodies contained in the sera A, B, and C are identified, respectively (see Table 3). The fusion proteins that react with all of sera A, B, and C are LacZ-Env fusion proteins containing an amino acid sequence of Env (512-611) and/or an amino acid sequence of Env (721-826).

Step 4 (Evaluation of LacZ-Env (512-611) and LacZ-Env (721-826) fusion proteins as antigens for diagnosis)

For confirming the usefulness of LacZ-Env (512-611) and LacZ-Env (721-826) fusion proteins as antigens for diagnosis, Western blotting is conducted in substantially the same manner as in Step 3 using sera from 41 HIV carriers (in particular, 36 asymptomatic HIV carriers, 1 ARC and 4 AIDS patients). The results of Western blotting are shown in Table 4, together with those of 3 asymptomatic HIV carriers of Step 3. LacZ-Env (512-611) reacts with all of the sera from 44 HIV carriers (100%). On the other hand, LacZ-Env (721-826) reacts with only 35 out of 44 HIV carriers (79%). Therefore, Env (512-611) is considered to be useful as an antigen for diagnosis. The Env (721-826) region cannot be independently used for diagnosis, but it would be useful in combination with other antigens, for example, the Env (512-611) region. However, sera from 2 out of 39 asymptomatic HIV carriers weakly react with LacZ (β-galactosidase) and, therefore, it would be undesirable to use the LacZ-Env fusion protein as it is for diagnostic purposes. Nevertheless, the LacZ-Env fusion protein can be used for diagnostic purposes if test sera are pretreated so as to preadsorb anti-LacZ antibodies in the sera with LacZ protein according to the customary method, as mentioned hereinbefore.
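The reactivity rates in this step can be recomputed from the stated counts. The sketch below uses only the numbers given in the text (41 sera of this step plus the 3 sera of Step 3); the helper name is illustrative.

```python
def percent(hits: int, total: int) -> float:
    """Proportion of reactive sera, expressed as a percentage."""
    return 100 * hits / total


total_carriers = 41 + 3      # sera of Step 4 plus sera A, B, and C of Step 3
asymptomatic = 36 + 3        # asymptomatic carriers in the combined panel

env_512_611 = percent(44, total_carriers)
env_721_826 = percent(35, total_carriers)
lacz_reactive = percent(2, asymptomatic)

print(f"LacZ-Env (512-611): {env_512_611:.1f}%")
print(f"LacZ-Env (721-826): {env_721_826:.1f}%")
print(f"weak reaction with LacZ alone: {lacz_reactive:.1f}% of asymptomatic carriers")
```

The ratio 35/44 is about 79.5%, which the text reports as 79%, and the 2 of 39 asymptomatic carriers whose sera react weakly with LacZ correspond to about 5.1%.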

Step 5 (Construction of plasmids capable of expressing Env proteins under the control of the T7 promoter)

Synthetic oligonucleotides 5'TATGGCTAAG 3' (SEQ ID No.: 3) and 5'AATTCTTAGCCA 3' (SEQ ID No.: 4) are annealed, and inserted into plasmid pT7-7, a plasmid of the pT7 series (see Proceedings of the National Academy of Sciences USA, 82, 1074-1078, 1985), having been digested with NdeI and EcoRI, to thereby obtain plasmid pT7-7-1. The plasmid pT7-7-1 is the plasmid having a one nucleotide insertion of an adenine residue (A) between the NdeI site and the EcoRI site being multicloning sites of pT7-7. The plasmid pT7-7-1 is digested with BamHI and PstI, and the fragment of about 580 b obtained by digesting plasmid pEH22 (see Step 1) with BamHI and PstI is cloned thereinto to thereby obtain plasmid pTE182 (see Table 5). Subsequently, the thus obtained plasmid pTE182 is digested with HindIII, and then self-ligated to thereby obtain plasmid pTE192 (see Table 5). Plasmid pNS210 (see Step 1) is digested with NdeI, and further partially digested with BglII to thereby obtain an NdeI-BglII fragment (nucleotide number 6399-7611 according to GenBank data file HIVNL43). The thus obtained NdeI-BglII fragment is cloned into plasmid pUR292 (see Step 1) having been digested with NdeI and BamHI to thereby obtain plasmid pNB21. The obtained plasmid pNB21 is digested with BglII and ClaI to thereby obtain a fragment of about 0.6 kb. The obtained fragment is cloned into plasmid pT7-7 having been digested with BamHI and ClaI to thereby obtain plasmid pTE311 (see Table 5). The obtained plasmids pTE182, pTE192, and pTE311 express Env (512-699), Env (512-611), and Env (244-437), respectively. Plasmids capable of expressing an Env protein under the control of the T7 promoter are shown in Table 5. E. coli strain BL21 (DE3) is used as a host for expression. The culturing of E. coli cells and the analysis of proteins are conducted in substantially the same manner as described in Steps 2 and 3. The proportion of the Env proteins expressed by plasmids pTE182, pTE192, and pTE311 to the total cell proteins is only about 1 to 2%. This shows that, whichever of these plasmids is chosen, it is difficult to express an Env protein alone in a practically acceptable yield.
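The annealing of the two synthetic oligonucleotides at the start of this step can be checked computationally. The following is a minimal sketch assuming the usual adaptor geometry, in which the 3' portion of the first oligonucleotide pairs with the 3' portion of the second so that both strands keep a 5' overhang; the function names are illustrative and are not part of the described procedure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top: str, bottom: str):
    """Return (top 5' overhang, duplex, bottom 5' overhang).

    All three strings are written 5' to 3' on their own strand; the duplex
    is reported on the top strand. Assumes both overhangs are 5' overhangs.
    """
    bottom_rc = reverse_complement(bottom)
    # Find the longest suffix of the top strand that pairs with the bottom strand.
    for n in range(min(len(top), len(bottom_rc)), 0, -1):
        if top[-n:] == bottom_rc[:n]:
            return top[:-n], top[-n:], bottom[: len(bottom) - n]
    raise ValueError("no complementary overlap found")


top_overhang, duplex, bottom_overhang = anneal("TATGGCTAAG", "AATTCTTAGCCA")
print(top_overhang, duplex, bottom_overhang)
```

The two oligonucleotides pair over 8 base pairs (TGGCTAAG) and leave 5' overhangs of TA and AATT, which match the cohesive ends produced by NdeI and EcoRI, respectively.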

Step 6 (Construction of plasmids capable of expressing Gag proteins under the control of the T7 promoter)

Plasmid pTG591 (see Japanese Patent Application Laid-Open Specification No. 4-117289) is digested with NdeI and BclI to obtain a fragment of about 1.6 kb. This fragment is cloned into plasmids pT7-7 and pET-3a (see Methods in Enzymology, 185, 60-89, 1990) each having been digested with NdeI and BamHI, to thereby obtain plasmids pTG581 and pEG581 (see Table 6), respectively. These plasmids express the gag gene (p55).

Plasmids pTG210, pTG110, and pTG591 (see Japanese Patent Application Laid-Open Specification No. 4-117289) are individually digested with ApaI and ClaI, treated with T4 DNA polymerase, and self-ligated, to thereby obtain plasmids pTG210-2, pTG110-2, and pTG561 (see Table 6), respectively. These plasmids are, respectively, capable of expressing Gag (308-405), Gag (121-405), and Gag (1-405). The plasmids which express Gag proteins under the control of the T7 promoter are shown in Table 6. E. coli strain BL21 (DE3) is used as a host for the expression. Culturing of E. coli cells and analysis of the obtained proteins are conducted in substantially the same manner as in Steps 2 and 3.

Step 7 (Construction of plasmids capable of expressing Gag-Env fusion proteins under the control of the T7 promoter)

Plasmid pAS192 (see Step 1) is digested with BamHI, treated with T4 DNA polymerase, and digested with ClaI, followed by agarose gel electrophoresis. From the agarose gel, a fragment of about 310 b is recovered. This fragment is cloned into plasmid pTG210 (see Japanese Patent Application Laid-Open Specification No. 4-117289) having been digested with ApaI, treated with T4 DNA polymerase and digested with ClaI to thereby obtain plasmid pGE33 (see Table 7). The obtained plasmid pGE33 is digested with HindIII and then subjected to agarose gel electrophoresis. From the agarose gel, a fragment of about 600 b is recovered. This fragment is cloned into plasmids pTG110 and pTG591 (see Japanese Patent Application Laid-Open Specification No. 4-117289) each having been digested with HindIII and treated with BAP, to thereby obtain plasmids pGE1133 and pGE5633 (see Table 7), respectively. The obtained plasmids pGE33, pGE1133, and pGE5633 express fusion proteins Gag(308-406)-Env(512-611), Gag(121-406)-Env(512-611), and Gag(1-406)-Env(512-611), respectively.

The nucleotide sequence between the BamHI site and the HindIII site of the multicloning sites of plasmid pT7-7-1 (see Step 5) is replaced with that of plasmid pUR292 (see Step 1) to thereby obtain plasmid pT7-29-1. The obtained pT7-29-1 is digested with BamHI, treated with T4 DNA polymerase, and self-ligated to thereby obtain plasmid pT7-29-14. The above-mentioned plasmid pGE33 is digested with HindIII to thereby obtain a fragment of about 0.6 kb, and this fragment is cloned into plasmid pT7-29-14 having been digested with HindIII and treated with BAP, to thereby obtain plasmid pGE2133.

Plasmid pAS182 (see Step 1) is digested with BamHI and ClaI to obtain a fragment of about 590 b. This fragment is cloned into plasmids pGE2133 and pGE1133 each having been digested with BamHI and ClaI, to thereby obtain plasmids pGE218 and pGE118, respectively (see Table 7). Plasmids pGE218 and pGE118 express fusion proteins Gag(308-406)-Env(512-699) and Gag(121-406)-Env(512-699), respectively.

Plasmid pT7-7 (see Step 5) is digested with BglII, treated with T4 DNA polymerase, and self-ligated to thereby obtain plasmid pT7-7 (BglIIx). Plasmid pTG210 (see Japanese Patent Application Laid-Open Specification No. 4-117289) is digested with NdeI and ClaI to obtain a fragment of about 1 kb. This fragment is cloned into plasmid pT7-7 (BglIIx) having been digested with NdeI and ClaI to thereby obtain plasmid pTG210X.

Plasmid pAS192 (see Step 1) is digested with BamHI and ClaI to obtain a fragment of about 310 b. This fragment is cloned into plasmid pTG210X having been digested with BglII and ClaI, to thereby obtain plasmid pGE31 (see Table 7). Plasmid pGE31 expresses fusion protein Gag(308-437)-Env(512-611).

Plasmids which express Gag-Env fusion proteins under the control of the T7 promoter are shown in Table 7. E. coli strain BL21 (DE3) is used as a host for expression. Culturing of E. coli cells to attain large-quantity production of a fusion protein and analysis of the protein are conducted in substantially the same manner as in Steps 2 and 3. The total cell proteins of the E. coli cells which have produced fusion proteins shown in Table 7 are fractionated by SDS-PAGE, and the gels are stained with CBB. By scanning the gel with a densitometer, the proportions of the produced fusion proteins to the total cell proteins are measured. The greatest proportion, about 20%, is exhibited by the fusion proteins expressed by plasmids pGE33, pGE218, and pGE31.

Step 8 (Confirmation of the reactivity of Gag-Env fusion protein with HIV antibodies)

Plasmids pGE33, pGE31, and pGE218 constructed in Step 7 express large quantities of Gag-Env fusion proteins in E. coli strain BL21 (DE3). Of these fusion proteins, the protein produced in an especially large quantity is the Gag(308-406)-Env(512-611) fusion protein expressed by pGE33. In order to confirm the usefulness of this fusion protein as an antigen for diagnosis, the reaction between the fusion protein and sera taken from each of 41 HIV carriers (36 asymptomatic HIV carriers, 1 ARC patient, and 4 AIDS patients) is investigated by conventional Western blotting in substantially the same manner as described in Steps 2 and 3 (see Table 8). As a result, it is found that Env antibodies can be detected in all of the 41 carriers, thus assuring the usefulness of the fusion protein as an antigen for diagnosis.

Step 9 (Purification of the Gag(308-406)-Env(512-611) fusion protein)

250 ml of a culture of E. coli strain BL21 (DE3) which has produced the Gag(308-406)-Env(512-611) fusion protein in large quantity is subjected to centrifugation at 5,000 rpm for 10 minutes to thereby harvest the E. coli cells. The harvested cells are suspended in 10 ml of a buffer containing 50 mM Tris-HCl (pH 7.5) and 10 mM 2-mercaptoethanol, and the resultant suspension is subjected to ultrasonication to thereby disrupt the cells. When the resultant lysate is centrifuged at 19,000 rpm for 30 minutes, the Gag-Env fusion protein is contained in the precipitate. The supernatant is discarded, and the precipitate is suspended in 10 ml of a buffer containing 50 mM Tris-HCl (pH 7.5) and 10 mM 2-mercaptoethanol. To the obtained suspension, 5 ml of SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis) sample buffer is added, and the mixture is mixed well and heated at 100° C. for 5 minutes. The heated mixture is centrifuged at 12,000 rpm for 5 minutes, and 2 ml (per batch) of the resultant supernatant is applied to an SDS-PAGE gel of a Model 491 Prep Cell (available from Bio-Rad Laboratories, U.S.A.) to carry out electrophoresis at 40 mA. Chromatography is conducted at a flow rate of 1 ml/min and in a fraction size of 2.5 ml/frac. to thereby collect a peak fraction containing the Gag-Env fusion protein.

The peak fraction is concentrated about 20-fold, and the resultant concentrate is subjected to SDS-PAGE, followed by staining with CBB.

As a result, it is found that the fusion protein is highly purified, with no other protein bands observed.

About 5 mg of purified Gag(308-406)-Env(512-611) fusion protein is obtained from one liter of E. coli culture.
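
The stated yield of about 5 mg per liter can be scaled to the 250 ml culture processed in this step. The following Python sketch performs that arithmetic and also estimates how many 2 ml Prep Cell batches the sample would require. The 15 ml sample volume is an assumption (10 ml of suspension plus 5 ml of sample buffer, ignoring the pellet removed at 12,000 rpm and any handling losses), and the function names are hypothetical.

```python
import math

YIELD_MG_PER_LITER = 5.0  # "about 5 mg" per liter of culture, as stated in Step 9


def expected_yield_mg(culture_ml: float,
                      yield_mg_per_liter: float = YIELD_MG_PER_LITER) -> float:
    """Scale the per-liter yield to a culture volume given in milliliters."""
    return yield_mg_per_liter * culture_ml / 1000.0


def batches_needed(sample_ml: float, batch_ml: float = 2.0) -> int:
    """Number of Prep Cell runs needed when each run takes batch_ml of sample."""
    return math.ceil(sample_ml / batch_ml)


# 250 ml is the culture volume used in Step 9.
print(expected_yield_mg(250))   # 1.25 (mg, approximate)

# Assumed sample volume: 10 ml suspension + 5 ml sample buffer = 15 ml.
print(batches_needed(15.0))     # 8
```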

Step 10 (Usefulness of the purified Gag-Env fusion protein as an antigen for diagnosis)

The preparation of purified Gag(308-406)-Env(512-611) fusion protein obtained in Step 9 is diluted, and the dilution is dotted onto a polyvinylidene difluoride membrane to obtain dots of the fusion protein in amounts of 10, 20, 40, 80, 160, and 320 ng. The dots are individually blocked with skim milk, and reacted with sera from each of 55 HIV carriers (in particular, 50 asymptomatic HIV carriers, 1 ARC patient and 4 AIDS patients) and from 84 non-infected individuals (healthy individuals), the sera having been diluted 100-fold with a buffer containing 20 mM Tris-HCl (pH 7.5), 150 mM NaCl and 0.05% Tween 20. Peroxidase-conjugated goat anti-human IgG (available from Bio-Rad Laboratories, U.S.A.) is used as a secondary antibody, and the color reaction is performed by the customary method. Results of the above dot blotting are shown in Table 9.

As little as 20 ng of the fusion protein specifically reacts with all the sera from 55 HIV carriers, and even 5 ng of the fusion protein specifically reacts with all the sera from the HIV carriers except 2 asymptomatic HIV carriers. Neither specific reaction nor non-specific reaction is observed between as much as 320 ng of the fusion protein and the sera from 84 healthy individuals. From these results, it is judged that the purified fusion protein exhibits extremely high specificity and a broad spectrum of seroreactivity, thereby ensuring the usefulness of the protein as an antigen for diagnosis.
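
The counts stated in the preceding paragraph can be expressed as detection rates. The following Python sketch computes the fraction of carrier sera detected at 20 ng and at 5 ng, and the fraction of healthy sera that remain non-reactive at 320 ng. Only the counts given in that paragraph are used; the terms "sensitivity" and "specificity" and the function name `rate` are labels added here for illustration.

```python
from fractions import Fraction


def rate(positive: int, total: int) -> Fraction:
    """Return positive/total as an exact fraction."""
    if total <= 0:
        raise ValueError("total must be positive")
    return Fraction(positive, total)


CARRIERS = 55   # HIV carrier sera tested
HEALTHY = 84    # sera from non-infected individuals

sensitivity_20ng = rate(CARRIERS, CARRIERS)      # all carriers react at 20 ng
sensitivity_5ng = rate(CARRIERS - 2, CARRIERS)   # all but 2 react at 5 ng
specificity_320ng = rate(HEALTHY - 0, HEALTHY)   # no healthy serum reacts

print(f"20 ng: {float(sensitivity_20ng):.1%}")          # 20 ng: 100.0%
print(f"5 ng: {float(sensitivity_5ng):.1%}")            # 5 ng: 96.4%
print(f"specificity: {float(specificity_320ng):.1%}")   # specificity: 100.0%
```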

Example 2

Step 1 (Production of highly purified Gag protein p55)

A culture of E. coli transformant BL21(DE3)/pTG581 having produced a large quantity of Gag protein p55 is centrifuged at 5,000 rpm for 10 minutes to thereby harvest the cells. The harvested cells are suspended in a phosphate buffer containing 20 mM sodium phosphate (pH 6.9) and 10 mM 2-mercaptoethanol, the volume of which is 1/50 that of the above-mentioned culture, and the resultant suspension is subjected to ultrasonication to thereby disrupt the cells. The resultant lysate is centrifuged at 19,000 rpm for 60 minutes to obtain a supernatant containing p55. The supernatant is treated with ammonium sulfate at 20% saturation to thereby obtain a precipitate. The obtained precipitate is dissolved in a phosphate buffer as defined above but containing 8 M urea. The resultant solution is passed through a column of S-Sepharose (manufactured and sold by Pharmacia Fine Chemicals AB, Sweden) equilibrated with the same phosphate buffer as mentioned above. Elution is carried out with the same buffer to which sodium chloride has been added in a 0 to 1 M concentration gradient, thus obtaining p55 fractions. The obtained p55 fractions are pooled. The pooled fractions are dialyzed against a phosphate buffer as defined above but containing 300 mM sodium chloride, followed by centrifugation at 19,000 rpm for 20 minutes. The resultant supernatant is passed through a column of Heparin-Sepharose CL-6B (manufactured and sold by Pharmacia Fine Chemicals AB, Sweden) equilibrated with the above-defined phosphate buffer. Elution is performed with the same buffer to which sodium chloride has been added in a 0 to 1 M concentration gradient, to thereby obtain p55 fractions. The obtained p55 fractions are pooled and then concentrated. A sample buffer for SDS-PAGE is added to the resultant concentrate and mixed well. The mixture is applied to an SDS-PAGE gel in a Prep Cell. Chromatography is performed under the same conditions as described in Step 9 of Example 1. The resultant p55 fraction is concentrated about 20-fold, and the resultant concentrate is subjected to SDS-PAGE, followed by staining with CBB. It is found that p55 is highly purified, with no other protein bands observed.

Step 2 (Reactivity of respective Gag proteins p17, p24, and p15 and the entire Gag protein, p55, with sera from HIV carriers, the Gag proteins having been produced in large quantities by E. coli and highly purified)

Highly purified HIV-1 Gag proteins p17, p24, p15 (see WO91/18990), and p55 (see Step 1 of Example 2) are individually dotted onto a polyvinylidene difluoride membrane in substantially the same manner as described in Step 10 of Example 1, and reacted separately with sera from 40 HIV carriers (in particular, 35 asymptomatic HIV carriers, 1 ARC patient and 4 AIDS patients) and from 10 non-infected individuals (healthy individuals). A serum reaction and a coloring reaction are carried out in substantially the same manner as described in Step 10 of Example 1. Results of such reactions are shown in Table 10. Gag proteins p17, p24, and p15 detect specific antibodies in 92.5% (37/40), 87.5% (35/40), and 85% (34/40) of the carriers, respectively. Gag protein p55 specifically reacts with all of the sera from 40 HIV carriers, and the reactions are stronger than those of p17, p24, and p15. It is especially noted that p55 reacts with the serum from one asymptomatic HIV carrier, which reacts with none of the Gag proteins p17, p24, and p15. The Gag protein which exhibits the weakest reactivity is p15.

With respect to Gag proteins p17, p24, and p55, the reactivity with sera from ARC and AIDS patients is weaker than that with sera from asymptomatic HIV carriers. This phenomenon is not observed with p15. In all of the 10 healthy individuals, no reaction takes place.

From the above results, it is seen that the Gag protein p55 is most excellent as a Gag antigen for screening HIV infection.
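
The detection rates quoted in Step 2 for p17, p24, p15, and p55 follow from the "Reacted" and "Tested" columns of Table 10. The following Python sketch reproduces that arithmetic by summing the three carrier groups for each antigen. The counts are transcribed from Table 10; the dictionary layout and the function name `detection_rate` are illustrative choices.

```python
# (reacted, tested) per carrier group, transcribed from Table 10.
TABLE_10 = {
    "p55": {"AC": (35, 35), "ARC": (1, 1), "AIDS": (4, 4)},
    "p17": {"AC": (33, 35), "ARC": (1, 1), "AIDS": (3, 4)},
    "p24": {"AC": (31, 35), "ARC": (0, 1), "AIDS": (4, 4)},
    "p15": {"AC": (30, 35), "ARC": (1, 1), "AIDS": (3, 4)},
}


def detection_rate(groups: dict[str, tuple[int, int]]) -> tuple[int, int, float]:
    """Return (reacted, tested, percent reacted) summed over carrier groups."""
    reacted = sum(r for r, _ in groups.values())
    tested = sum(t for _, t in groups.values())
    return reacted, tested, 100 * reacted / tested


for antigen, groups in TABLE_10.items():
    reacted, tested, percent = detection_rate(groups)
    print(f"{antigen}: {reacted}/{tested} = {percent:.1f}%")
```

Run as written, this gives 40/40 for p55, 37/40 for p17, 35/40 for p24, and 34/40 for p15, matching the percentages stated in Step 2.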

TABLE 1
__________________________________________________________________________________
Plasmids for expression of LacZ-Env fusion proteins

          5' cloning  Nt. no. of 5'  Nt. no. of 3'  3' cloning
Plasmid   site*       cloning site*  cloning site*  site*       Product**
__________________________________________________________________________________
pAS160    KpnI        6343           7031           BglII       LacZ-Env(14-244)
pAS210    KpnI        6343           7611           BglII       LacZ-Env(14-437)
pAS200    KpnI        6343           8131           HindIII     LacZ-Env(14-611)
pAS172    StuI        6822           7391           ScaI        LacZ-Env(175-363)
pAS220    HaeIII      6969           7834           HaeIII      LacZ-Env(224-510)
pAS311    BglII       7031           7611           BglII       LacZ-Env(244-437)
pAS331    BglII       7031           8131           HindIII     LacZ-Env(244-611)
pAS111    BglII       7031           8465           BamHI       LacZ-Env(244-722)
pAS131    BglII       7031           8887           XhoI        LacZ-Env(244-826)
pAS342    BglII       7611           8131           HindIII     LacZ-Env(437-611)
pAS122    BglII       7611           8465           BamHI       LacZ-Env(437-722)
pAS142    BglII       7611           8887           XhoI        LacZ-Env(437-826)
pAS192    HaeIII      7834           8131           HindIII     LacZ-Env(512-611)
pAS182    HaeIII      7834           8400           HaeIII      LacZ-Env(512-699)
pAS351    HindIII     8131           8465           BamHI       LacZ-Env(610-722)
pAS151    HindIII     8131           8887           XhoI        LacZ-Env(610-826)
pAS451    BamHI       8465           8887           XhoI        LacZ-Env(721-826)
__________________________________________________________________________________
*Nucleotide sequence and nucleotide number are according to GenBank data file HIVNL43.
**Numbers in parentheses show amino acid numbers counted from the N-terminus of the Env protein (gp160).

TABLE 2
______________________________________________________________
Reactivity shown by Western blotting of LacZ-Env fusion
proteins with sera of three asymptomatic HIV-1 carriers

Plasmid   Product               Serum A   Serum B   Serum C
______________________________________________________________
pUR290    LacZ                  -         -         -
pAS160    LacZ-Env(14-244)      -         -         +
pAS210    LacZ-Env(14-437)      -         -         +
pAS200    LacZ-Env(14-611)      +         +         +
pAS172    LacZ-Env(175-363)     -         -         +
pAS220    LacZ-Env(224-510)     -         +         +
pAS311    LacZ-Env(244-437)     -         -         +
pAS331    LacZ-Env(244-611)     +         +         +
pAS111    LacZ-Env(244-722)     +         +         +
pAS131    LacZ-Env(244-826)     +         +         +
pAS342    LacZ-Env(437-611)     +         +         +
pAS122    LacZ-Env(437-722)     +         +         +
pAS142    LacZ-Env(437-826)     +         +         +
pAS192    LacZ-Env(512-611)     +         +         +
pAS182    LacZ-Env(512-699)     +         +         +
pAS351    LacZ-Env(610-722)     -         -         +
pAS151    LacZ-Env(610-826)     +         +         +
pAS451    LacZ-Env(721-826)     +         +         +
______________________________________________________________

TABLE 3
______________________________________________________________
Identified epitope regions on the Env protein

                  Serum A         Serum B         Serum C
______________________________________________________________
Identified                                        Env(14-244)
epitope regions                   Env(224-510)    Env(244-437)
                  Env(512-611)    Env(512-611)    Env(512-611)
                                                  Env(610-722)
                  Env(721-826)    Env(721-826)    Env(721-826)
______________________________________________________________

TABLE 4
______________________________________________________________
Detection of Env antibodies in sera of HIV-1 carriers

Antigen             Sera    +     ±     -     Total
______________________________________________________________
LacZ-Env(512-611)   AC      39    -     -     39
                    ARC     1     -     -     1
                    AIDS    4     -     -     4
LacZ-Env(721-826)   AC      34    3     2     39
                    ARC     -     1     -     1
                    AIDS    1     3     -     4
______________________________________________________________

TABLE 5
__________________________________________________________________________________
Plasmids for expression of Env proteins of HIV-1

           5' cloning  Nt. no. of 5'  Nt. no. of 3'  3' cloning
Plasmid    site*       cloning site*  cloning site*  site*       Product**
__________________________________________________________________________________
pTE160     KpnI        6343           7031           BglII       Env(14-244)
pTE210     KpnI        6343           7611           BglII       Env(14-437)
pTE200     KpnI        6343           8131           HindIII     Env(14-611)
pTE172     StuI        6822           7391           ScaI        Env(175-363)
pTE17-2    StuI        6822           7391           ScaI        Env(175-363)
pTE220     HaeIII      6969           7834           HaeIII      Env(244-510)
pTE331     BglII       7031           7611           BglII       Env(244-437)
pTE342     BglII       7611           8131           HindIII     Env(244-611)
pTS23      BglII       7611           8887           XhoI        Env(244-826)
pTE192     HaeIII      7834           8132           HindIII     Env(512-611)
pTE18-192  HaeIII      7834           8131           HindIII     Env(512-611)
pTE182     HaeIII      7834           8400           HaeIII      Env(512-699)
pTE18-182  HaeIII      7834           8400           HaeIII      Env(512-699)
pTS45      BamHI       8465           8887           XhoI        Env(721-826)
__________________________________________________________________________________
*Nucleotide sequence and nucleotide number are according to GenBank data file HIVNL43.
**Numbers in parentheses show amino acid numbers counted from the N-terminus of the Env protein (gp160).

Even if an appropriate plasmid is chosen, the expression of the Env protein alone in a practically acceptable yield is found to be difficult.

TABLE 6
__________________________________________________________________________________
Plasmids for expression of Gag proteins of HIV-1

          5' cloning  Nt. no. of 5'  Nt. no. of 3'  3' cloning
Plasmid   site*       cloning site*  cloning site*  site*       Product***
__________________________________________________________________________________
pTG581    NdeI**      787            2429           BclI        Gag(1-500)
pEG581    NdeI**      787            2429           BclI        Gag(1-500)
pTG571    NdeI**      787            2096           BglII       Gag(1-437)
pEG571    NdeI**      787            2096           BglII       Gag(1-437)
pTG561    NdeI**      787            2006           ApaI        Gag(1-405)
pTG581    NdeI**      787            1712           HindIII     Gag(1-309)
pTG541    NdeI**      787            1415           PstI        Gag(1-210)
pTG531    NdeI**      787            1247           NsiI        Gag(1-154)
pTG207    NdeI**      787            1415           PstI        Gag(1-132)****
pTG521    NdeI**      787            1145           PvuII       Gag(1-119)
pTG121    PvuII       1145           2096           BglII       Gag(121-437)
pTG110-2  PvuII       1145           2006           ApaI        Gag(121-405)
pTG212    HindIII     1712           2429           BclI        Gag(308-500)
pTG221    HindIII     1712           2096           BglII       Gag(308K-437)
pTG210-2  HindIII     1712           2006           ApaI        Gag(308-405)
__________________________________________________________________________________
*Nucleotide sequence and nucleotide number are according to GenBank data file HIVNL43.
**NdeI site is introduced by in vitro mutagenesis at the initiation codon of the gag gene.
***Numbers in parentheses show amino acid numbers counted from the N-terminus of the Gag protein (p55).
****Termination codon is introduced at the first codon of p24, leading to the expression of p17.

TABLE 7
______________________________________________________
Plasmids for expression of Gag-Env fusion proteins

           Product
           Gag protein region   Env protein region
Plasmid    (a.a.)               (a.a.)
______________________________________________________
pGE216     308-406              14-244
pGE116     121-406              14-244
pGE221     308-406              14-437
pGE217     308-406              175-363
pGE117     121-406              175-363
pGE231     308-406              244-437
pGE131     121-406              244-437
pGE223     308-406              437-510
pGE123     121-406              437-510
pGE523     1-406                437-510
pGE2134    308-406              437-611
pGE1134    121-406              437-611
pGE5634    1-406                437-611
pGE271     1-119                437-611
pGE2112    308-406              437-722
pGE1112    121-406              437-722
pGE5612    1-406                437-722
pGE2142    308-406              437-826
pGE1142    121-406              437-826
pGE5642    1-406                437-826
pGE30      308-436              437-826
pGE33      308-406              512-611
pGE1133    121-406              512-611
pGE5633    1-406                512-611
pGE281     1-119                512-611
pGE31      308-437              512-611
pGE218     308-406              512-699
pGE118     121-406              512-699
pGE280     1-119                512-699
pGE34      308-406              721-826
pGE1145    121-406              721-826
pGE5645    1-406                721-826
pGE290     1-119                721-826
pGE32      308-435              723-826
______________________________________________________

TABLE 8
______________________________________________________________
Western blotting of Gag-Env fusion protein with sera of
HIV-1 carriers

Antigen                     Sera    +     ±     -     Total
______________________________________________________________
Gag(308-406)-Env(512-611)   AC      36    -     -     36
                            ARC     1     -     -     1
                            AIDS    4     -     -     4
______________________________________________________________

                  TABLE 9
____________________________________________________________
Reactivity of the purified Gag-Env fusion protein with
sera of HIV-1 carriers and non-infected persons

Sera                        -     +     ++    +++   Total
____________________________________________________________
AC                          0     1     1     53    55
ARC                         0     0     0     1     1
AIDS                        0     0     0     4     4
Non-infected individuals    84    0     0     0     84
(healthy individuals)
____________________________________________________________
-: No reaction takes place with 320 ng of a purified fusion protein.
+, ++, +++: Reaction takes place with at least 20 ng, at least 10 ng, and
at least 5 ng of the purified fusion protein.
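
The counts in Table 9 can be condensed into two summary proportions. The following is a minimal sketch and not part of the original disclosure: the names `TABLE_9`, `CARRIER_GROUPS` and `sensitivity_specificity`, the terms "sensitivity" and "specificity", and the choice to treat any of the grades +, ++ or +++ as a positive reaction are assumptions introduced here for illustration.

```python
# Counts transcribed from TABLE 9; tuple order is (-, +, ++, +++).
TABLE_9 = {
    "AC": (0, 1, 1, 53),
    "ARC": (0, 0, 0, 1),
    "AIDS": (0, 0, 0, 4),
    "Non-infected": (84, 0, 0, 0),
}
CARRIER_GROUPS = ("AC", "ARC", "AIDS")


def sensitivity_specificity(table):
    """Return (fraction of carrier sera reacting, fraction of non-infected sera not reacting).

    Assumption: any grade of +, ++ or +++ counts as a positive reaction.
    """
    carriers_positive = sum(sum(table[g][1:]) for g in CARRIER_GROUPS)
    carriers_total = sum(sum(table[g]) for g in CARRIER_GROUPS)
    non_infected = table["Non-infected"]
    return carriers_positive / carriers_total, non_infected[0] / sum(non_infected)


if __name__ == "__main__":
    sens, spec = sensitivity_specificity(TABLE_9)
    print(f"carrier sera reacting: {sens:.1%}")
    print(f"non-infected sera not reacting: {spec:.1%}")
```

Under these assumptions, all 60 carrier sera and none of the 84 non-infected sera in Table 9 are counted as reactive. The proportions describe only the panel tested here.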

                  TABLE 10
__________________________________________________________________
Reactivity of the Gag proteins with serum antibodies of HIV-1 carriers

        Reacted with at least (ng)
Sera    320   160   80    40    20    10    5     Reacted   Tested
__________________________________________________________________
(A) Detection of anti-p55 antibodies in the sera from HIV-1 carriers
AC                        3     2     3     27    35        35
ARC                                   1           1         1
AIDS                                  3     1     4         4
(B) Detection of anti-p17 antibodies in the sera from HIV-1 carriers
AC      1     5     3     8     11    5           33        35
ARC                       1                       1         1
AIDS          2           1                       3         4
(C) Detection of anti-p24 antibodies in the sera from HIV-1 carriers
AC      4     2     4     6     9     3     3     31        35
ARC                                               0         1
AIDS    1     1     2                             4         4
(D) Detection of anti-p15 antibodies in the sera from HIV-1 carriers
AC      1     2     8     16    3                 30        35
ARC                       1                       1         1
AIDS                1     2                       3         4
__________________________________________________________________
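
Each row of Table 10 can be checked by confirming that the per-amount counts add up to the "Reacted" column, and the rows can then be pooled per antigen. The sketch below is illustrative only: the names `TABLE_10`, `inconsistent_rows` and `detection_rate` are hypothetical, blank cells are read as zero, and the assignment of each count to an amount column is a reading of the table layout rather than something stated in the text.

```python
# Amounts of antigen (ng) heading the columns of TABLE 10.
AMOUNTS_NG = (320, 160, 80, 40, 20, 10, 5)

# antigen -> sera -> (counts per amount, "Reacted", "Tested").
# Assumption: blank cells in the table are read as 0.
TABLE_10 = {
    "p55": {
        "AC": ((0, 0, 0, 3, 2, 3, 27), 35, 35),
        "ARC": ((0, 0, 0, 0, 0, 1, 0), 1, 1),
        "AIDS": ((0, 0, 0, 0, 0, 3, 1), 4, 4),
    },
    "p17": {
        "AC": ((1, 5, 3, 8, 11, 5, 0), 33, 35),
        "ARC": ((0, 0, 0, 1, 0, 0, 0), 1, 1),
        "AIDS": ((0, 2, 0, 1, 0, 0, 0), 3, 4),
    },
    "p24": {
        "AC": ((4, 2, 4, 6, 9, 3, 3), 31, 35),
        "ARC": ((0, 0, 0, 0, 0, 0, 0), 0, 1),
        "AIDS": ((1, 1, 2, 0, 0, 0, 0), 4, 4),
    },
    "p15": {
        "AC": ((1, 2, 8, 16, 3, 0, 0), 30, 35),
        "ARC": ((0, 0, 0, 1, 0, 0, 0), 1, 1),
        "AIDS": ((0, 0, 1, 2, 0, 0, 0), 3, 4),
    },
}


def inconsistent_rows(table):
    """List (antigen, sera) rows whose per-amount counts do not sum to 'Reacted'."""
    return [
        (antigen, sera)
        for antigen, rows in table.items()
        for sera, (counts, reacted, _tested) in rows.items()
        if sum(counts) != reacted
    ]


def detection_rate(table, antigen):
    """Fraction of all tested sera (AC, ARC and AIDS pooled) that reacted."""
    rows = table[antigen].values()
    return sum(r[1] for r in rows) / sum(r[2] for r in rows)


if __name__ == "__main__":
    print("inconsistent rows:", inconsistent_rows(TABLE_10))
    for antigen in TABLE_10:
        print(antigen, f"{detection_rate(TABLE_10, antigen):.1%}")
```

Pooling the three sera groups is a design choice made for brevity; the per-group figures remain available in `TABLE_10` if a finer breakdown is wanted.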

__________________________________________________________________________
SEQUENCE LISTING
(1) GENERAL INFORMATION:
  (iii) NUMBER OF SEQUENCES: 4
(2) INFORMATION FOR SEQ ID NO:1:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 500 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (iii) HYPOTHETICAL: NO
  (iv) ANTI-SENSE: NO
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Human immunodeficiency virus type 1
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Gly Ala Arg Ala Ser Val Leu Ser Gly Gly Glu Leu Asp Lys Trp
1               5                   10                  15
Glu Lys Ile Arg Leu Arg Pro Gly Gly Lys Lys Gln Tyr Lys Leu Lys
            20                  25                  30
His Ile Val Trp Ala Ser Arg Glu Leu Glu Arg Phe Ala Val Asn Pro
        35                  40                  45
Gly Leu Leu Glu Thr Ser Glu Gly Cys Arg Gln Ile Leu Gly Gln Leu
    50                  55                  60
Gln Pro Ser Leu Gln Thr Gly Ser Glu Glu Leu Arg Ser Leu Tyr Asn
65                  70                  75                  80
Thr Ile Ala Val Leu Tyr Cys Val His Gln Arg Ile Asp Val Lys Asp
                85                  90                  95
Thr Lys Glu Ala Leu Asp Lys Ile Glu Glu Glu Gln Asn Lys Ser Lys
            100                 105                 110
Lys Lys Ala Gln Gln Ala Ala Ala Asp Thr Gly Asn Asn Ser Gln Val
        115                 120                 125
Ser Gln Asn Tyr Pro Ile Val Gln Asn Leu Gln Gly Gln Met Val His
    130                 135                 140
Gln Ala Ile Ser Pro Arg Thr Leu Asn Ala Trp Val Lys Val Val Glu
145                 150                 155                 160
Glu Lys Ala Phe Ser Pro Glu Val Ile Pro Met Phe Ser Ala Leu Ser
                165                 170                 175
Glu Gly Ala Thr Pro Gln Asp Leu Asn Thr Met Leu Asn Thr Val Gly
            180                 185                 190
Gly His Gln Ala Ala Met Gln Met Leu Lys Glu Thr Ile Asn Glu Glu
        195                 200                 205
Ala Ala Glu Trp Asp Arg Leu His Pro Val His Ala Gly Pro Ile Ala
    210                 215                 220
Pro Gly Gln Met Arg Glu Pro Arg Gly Ser Asp Ile Ala Gly Thr Thr
225                 230                 235                 240
Ser Thr Leu Gln Glu Gln Ile Gly Trp Met Thr His Asn Pro Pro Ile
                245                 250                 255
Pro Val Gly Glu Ile Tyr Lys Arg Trp Ile Ile Leu Gly Leu Asn Lys
            260                 265                 270
Ile Val Arg Met Tyr Ser Pro Thr Ser Ile Leu Asp Ile Arg Gln Gly
        275                 280                 285
Pro Lys Glu Pro Phe Arg Asp Tyr Val Asp Arg Phe Tyr Lys Thr Leu
    290                 295                 300
Arg Ala Glu Gln Ala Ser Gln Glu Val Lys Asn Trp Met Thr Glu Thr
305                 310                 315                 320
Leu Leu Val Gln Asn Ala Asn Pro Asp Cys Lys Thr Ile Leu Lys Ala
                325                 330                 335
Leu Gly Pro Gly Ala Thr Leu Glu Glu Met Met Thr Ala Cys Gln Gly
            340                 345                 350
Val Gly Gly Pro Gly His Lys Ala Arg Val Leu Ala Glu Ala Met Ser
        355                 360                 365
Gln Val Thr Asn Pro Ala Thr Ile Met Ile Gln Lys Gly Asn Phe Arg
    370                 375                 380
Asn Gln Arg Lys Thr Val Lys Cys Phe Asn Cys Gly Lys Glu Gly His
385                 390                 395                 400
Ile Ala Lys Asn Cys Arg Ala Pro Arg Lys Lys Gly Cys Trp Lys Cys
                405                 410                 415
Gly Lys Glu Gly His Gln Met Lys Asp Cys Thr Glu Arg Gln Ala Asn
            420                 425                 430
Phe Leu Gly Lys Ile Trp Pro Ser His Lys Gly Arg Pro Gly Asn Phe
        435                 440                 445
Leu Gln Ser Arg Pro Glu Pro Thr Ala Pro Pro Glu Glu Ser Phe Arg
    450                 455                 460
Phe Gly Glu Glu Thr Thr Thr Pro Ser Gln Lys Gln Glu Pro Ile Asp
465                 470                 475                 480
Lys Glu Leu Tyr Pro Leu Ala Ser Leu Arg Ser Leu Phe Gly Ser Asp
                485                 490                 495
Pro Ser Ser Gln
            500
(2) INFORMATION FOR SEQ ID NO:2:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 826 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (iii) HYPOTHETICAL: NO
  (iv) ANTI-SENSE: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Ser Ala Thr Glu Lys Leu Trp Val Thr Val Tyr Tyr Gly Val Pro Val
1               5                   10                  15
Trp Lys Glu Ala Thr Thr Thr Leu Phe Cys Ala Ser Asp Ala Lys Ala
            20                  25                  30
Tyr Asp Thr Glu Val His Asn Val Trp Ala Thr His Ala Cys Val Pro
        35                  40                  45
Thr Asp Pro Asn Pro Gln Glu Val Val Leu Val Asn Val Thr Glu Asn
    50                  55                  60
Phe Asn Met Trp Lys Asn Asp Met Val Glu Gln Met His Glu Asp Ile
65                  70                  75                  80
Ile Ser Leu Trp Asp Gln Ser Leu Lys Pro Cys Val Lys Leu Thr Pro
                85                  90                  95
Leu Cys Val Ser Leu Lys Cys Thr Asp Leu Lys Asn Asp Thr Asn Thr
            100                 105                 110
Asn Ser Ser Ser Gly Arg Met Ile Met Glu Lys Gly Glu Ile Lys Asn
        115                 120                 125
Cys Ser Phe Asn Ile Ser Thr Ser Ile Arg Asp Lys Val Gln Lys Glu
    130                 135                 140
Tyr Ala Phe Phe Tyr Lys Leu Asp Ile Val Pro Ile Asp Asn Thr Ser
145                 150                 155                 160
Tyr Arg Leu Ile Ser Cys Asn Thr Ser Val Ile Thr Gln Ala Cys Pro
                165                 170                 175
Lys Val Ser Phe Glu Pro Ile Pro Ile His Tyr Cys Ala Pro Ala Gly
            180                 185                 190
Phe Ala Ile Leu Lys Cys Asn Asn Lys Thr Phe Asn Gly Thr Gly Pro
        195                 200                 205
Cys Thr Asn Val Ser Thr Val Gln Cys Thr His Gly Ile Arg Pro Val
    210                 215                 220
Val Ser Thr Gln Leu Leu Leu Asn Gly Ser Leu Ala Glu Glu Asp Val
225                 230                 235                 240
Val Ile Arg Ser Ala Asn Phe Thr Asp Asn Ala Lys Thr Ile Ile Val
                245                 250                 255
Gln Leu Asn Thr Ser Val Glu Ile Asn Cys Thr Arg Pro Asn Asn Asn
            260                 265                 270
Thr Arg Lys Ser Ile Arg Ile Gln Arg Gly Pro Gly Arg Ala Phe Val
        275                 280                 285
Thr Ile Gly Lys Ile Gly Asn Met Arg Gln Ala His Cys Asn Ile Ser
    290                 295                 300
Arg Ala Lys Trp Asn Ala Thr Leu Lys Gln Ile Ala Ser Lys Leu Arg
305                 310                 315                 320
Glu Gln Phe Gly Asn Asn Lys Thr Ile Ile Phe Lys Gln Ser Ser Gly
                325                 330                 335
Gly Asp Pro Glu Ile Val Thr His Ser Phe Asn Cys Gly Gly Glu Phe
            340                 345                 350
Phe Tyr Cys Asn Ser Thr Gln Leu Phe Asn Ser Thr Trp Phe Asn Ser
        355                 360                 365
Thr Trp Ser Thr Glu Gly Ser Asn Asn Thr Glu Gly Ser Asp Thr Ile
    370                 375                 380
Thr Leu Pro Cys Arg Ile Lys Gln Phe Ile Asn Met Trp Gln Glu Val
385                 390                 395                 400
Gly Lys Ala Met Tyr Ala Pro Pro Ile Ser Gly Gln Ile Arg Cys Ser
                405                 410                 415
Ser Asn Ile Thr Gly Leu Leu Leu Thr Arg Asp Gly Gly Asn Asn Asn
            420                 425                 430
Asn Gly Ser Glu Ile Phe Arg Pro Gly Gly Gly Asp Met Arg Asp Asn
        435                 440                 445
Trp Arg Ser Glu Leu Tyr Lys Tyr Lys Val Val Lys Ile Glu Pro Leu
    450                 455                 460
Gly Val Ala Pro Thr Lys Ala Lys Arg Arg Val Val Gln Arg Glu Lys
465                 470                 475                 480
Arg Ala Val Gly Ile Gly Ala Leu Phe Leu Gly Phe Leu Gly Ala Ala
                485                 490                 495
Gly Ser Thr Met Gly Cys Thr Ser Met Thr Leu Thr Val Gln Ala Arg
            500                 505                 510
Gln Leu Leu Ser Asp Ile Val Gln Gln Gln Asn Asn Leu Leu Arg Ala
        515                 520                 525
Ile Glu Ala Gln Gln His Leu Leu Gln Leu Thr Val Trp Gly Ile Lys
    530                 535                 540
Gln Leu Gln Ala Arg Ile Leu Ala Val Glu Arg Tyr Leu Lys Asp Gln
545                 550                 555                 560
Gln Leu Leu Gly Ile Trp Gly Cys Ser Gly Lys Leu Ile Cys Thr Thr
                565                 570                 575
Ala Val Pro Trp Asn Ala Ser Trp Ser Asn Lys Ser Leu Glu Gln Ile
            580                 585                 590
Trp Asn Asn Met Thr Trp Met Glu Trp Asp Arg Glu Ile Asn Asn Tyr
        595                 600                 605
Thr Ser Leu Ile His Ser Leu Ile Glu Glu Ser Gln Asn Gln Gln Glu
    610                 615                 620
Lys Asn Glu Gln Glu Leu Leu Glu Leu Asp Lys Trp Ala Ser Leu Trp
625                 630                 635                 640
Asn Trp Phe Asn Ile Thr Asn Trp Leu Trp Tyr Ile Lys Leu Phe Ile
                645                 650                 655
Met Ile Val Gly Gly Leu Val Gly Leu Arg Ile Val Phe Ala Val Leu
            660                 665                 670
Ser Ile Val Asn Arg Val Arg Gln Gly Tyr Ser Pro Leu Ser Phe Gln
        675                 680                 685
Thr His Leu Pro Ile Pro Arg Gly Pro Asp Arg Pro Glu Gly Ile Glu
    690                 695                 700
Glu Glu Gly Gly Glu Arg Asp Arg Asp Arg Ser Ile Arg Leu Val Asn
705                 710                 715                 720
Gly Ser Leu Ala Leu Ile Trp Asp Asp Leu Arg Ser Leu Cys Leu Phe
                725                 730                 735
Ser Tyr His Arg Leu Arg Asp Leu Leu Leu Ile Val Thr Arg Ile Val
            740                 745                 750
Glu Leu Leu Gly Arg Arg Gly Trp Glu Ala Leu Lys Tyr Trp Trp Asn
        755                 760                 765
Leu Leu Gln Tyr Trp Ser Gln Glu Leu Lys Asn Ser Ala Val Asn Leu
    770                 775                 780
Leu Asn Ala Thr Ala Ile Ala Val Ala Glu Gly Thr Asp Arg Val Ile
785                 790                 795                 800
Glu Val Leu Gln Ala Ala Tyr Arg Ala Ile Arg His Ile Pro Arg Arg
                805                 810                 815
Ile Arg Gln Gly Leu Glu Arg Ile Leu Leu
            820                 825
(2) INFORMATION FOR SEQ ID NO:3:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (iii) HYPOTHETICAL: NO
  (iv) ANTI-SENSE: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
TATGGCTAAG    10
(2) INFORMATION FOR SEQ ID NO:4:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 12 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (iii) HYPOTHETICAL: NO
  (iv) ANTI-SENSE: NO
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
AATTCTTAGCCA    12
__________________________________________________________________________

What is claimed is:
 1. A method for producing a substantially pure HIV-1 Gag-Env fusion protein consisting of a Gag sequence fused at its C-terminus to the N-terminus of an Env sequence, wherein the fusion protein has a sequence selected from the group consisting of: Gag (1-406) and Env (512-611), Gag (121-406) and Env (512-611), Gag (308-406) and Env (512-611), Gag (308-406) and Env (512-699), and Gag (308-437) and Env (512-611), where the numbers for Gag refer to amino acid residues in SEQ ID NO:1 and the numbers for Env refer to amino acid residues in SEQ ID NO:2, said method comprising the steps of: (a) ligating a DNA sequence encoding said fusion protein in operable linkage to a T7 promoter in a replicable expression plasmid, (b) transforming cells of Escherichia coli BL21(DE3) strain with said ligated plasmid, (c) culturing said transformed cells under conditions to produce said protein, and (d) substantially purifying said protein.
 2. The method of claim 1, wherein the replicable expression plasmid is pT7-7.
 3. A replicable recombinant plasmid comprising a T7 promoter operably linked to a sequence encoding an HIV-1 Gag-Env fusion protein consisting of a Gag sequence fused at its C-terminus to the N-terminus of an Env sequence, wherein the fusion protein has a sequence selected from the group consisting of: Gag (1-406) and Env (512-611), Gag (121-406) and Env (512-611), Gag (308-406) and Env (512-611), Gag (308-406) and Env (512-699), and Gag (308-437) and Env (512-611), where the numbers for Gag refer to amino acid residues in SEQ ID NO:1 and the numbers for Env refer to amino acid residues in SEQ ID NO:2.
 4. The plasmid of claim 3 which is a pT7-7 construct.
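
The residue numbering used in the claims is 1-based and inclusive, so each claimed fusion can be written down by joining one slice of SEQ ID NO:1 to one slice of SEQ ID NO:2. The sketch below is illustrative only and not part of the claims: the names `subsequence`, `gag_env_fusion` and `CLAIMED_FUSIONS` are hypothetical, and the placeholder residue labels (`G1`, `E1`, ...) stand in for the actual residues of the sequence listing; only the lengths 500 and 826 are taken from it.

```python
# Residue ranges (1-based, inclusive) of the fusions recited in the claims:
# (Gag range in SEQ ID NO:1, Env range in SEQ ID NO:2).
CLAIMED_FUSIONS = [
    ((1, 406), (512, 611)),
    ((121, 406), (512, 611)),
    ((308, 406), (512, 611)),
    ((308, 406), (512, 699)),
    ((308, 437), (512, 611)),
]


def subsequence(residues, start, end):
    """Return residues start..end, counting from 1 and including both ends."""
    if not (1 <= start <= end <= len(residues)):
        raise ValueError(f"range {start}-{end} outside 1-{len(residues)}")
    return residues[start - 1:end]


def gag_env_fusion(gag, env, gag_range, env_range):
    """Join the Gag slice (N-terminal part) to the Env slice (C-terminal part)."""
    return subsequence(gag, *gag_range) + subsequence(env, *env_range)


if __name__ == "__main__":
    # Placeholder residues; replace with the sequence listing for real use.
    gag = [f"G{i}" for i in range(1, 501)]  # SEQ ID NO:1 has 500 residues
    env = [f"E{i}" for i in range(1, 827)]  # SEQ ID NO:2 has 826 residues
    for gag_range, env_range in CLAIMED_FUSIONS:
        fusion = gag_env_fusion(gag, env, gag_range, env_range)
        print(gag_range, env_range, "->", len(fusion), "residues")
```

The length printed for each fusion is simply the sum of the two slice lengths; any residues contributed by a vector or linker in an actual construct are outside the scope of this sketch.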